OVHcloud Private Cloud Status

FS#11706 — pcc-27-n5
Incident Report for Hosted Private Cloud
Resolved
Several FEXs are down on this switch. As the pcc-26 is still being reconfigured, certain hosts are down.

Update(s):

Date: 2014-09-29 09:10:20 UTC
More details on this afternoon's downtime (approx. 18:30 Paris time):

Following hardware issues (fans) on the pcc-26 this morning, we replaced it with the spare, and the service was maintained by the pcc-27 alone. Synchronisation of the configuration took a few hours, which is normal. However, one of the resync scripts appears to have caused a CPU load peak on the pcc-27 (process ethpm). As a result, the pcc-27 lost its connection to the FEXs. At that time, around 18:15, we had an isolated pcc-26 that was still reconfiguring and a pcc-27 cut off from the FEXs. The two hosts connected to this pair were cut off, which caused downtime until the pcc-27 came back after a forced reboot around 19:00. Only from then on did the hosts begin to remount.

We are currently finishing bringing the pcc-26 back up so that this pair is fully redundant again.

Date: 2014-09-29 09:02:01 UTC
There's no longer an issue with the switch. The configuration is now normalised.

Date: 2014-09-29 09:01:31 UTC
4 of the 13 FEXs are down on the pcc-27-n5 following a load peak caused by one process.

As the situation cannot be remedied at this level, we've forced the pcc-27 to reload in order to remount the FEXs. All the FEXs are now up and the switch is running the configuration from 16:36. We will reapply the changes made since then.

The network is stable again. The team will work on remounting the hosts.
Posted Sep 29, 2014 - 08:58 UTC