We are on site and have restored all services. We discovered that our router was not configured to power on automatically after AC power recovery. As a result, when the third party came on site and powered on our UPS, everything except the router came back online. Our hypervisor and switching gear were in fact running.
We will be looking closely at the circumstances of this incident. Although we provide no SLA for hosted services and projects running on-prem, this period of downtime is not acceptable to us. We will be following up with the power utility about the recent instability, and we will create better procedures for incidents like this that occur when our staff are not on site. These will include, among other things, more explicit instructions for rebooting the site, including verifying that all hardware is powered on.
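As a rough illustration of what a "verify all hardware is powered on" step could look like, here is a minimal Python sketch. The device names, addresses, and ports are hypothetical placeholders rather than our actual inventory, and a TCP connection attempt is only one assumed way of probing a device.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical inventory: device name -> (management address, TCP port).
# These values are placeholders for illustration only.
EXPECTED_DEVICES: Dict[str, Tuple[str, int]] = {
    "router": ("192.0.2.1", 22),
    "switch": ("192.0.2.2", 22),
    "hypervisor": ("192.0.2.10", 22),
}


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_offline(
    inventory: Dict[str, Tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> List[str]:
    """Return the names of devices in the inventory that fail the probe."""
    return [name for name, (host, port) in inventory.items() if not probe(host, port)]
```

Anyone restarting the site could run `find_offline(EXPECTED_DEVICES)` after powering on the UPS and treat a non-empty result as a list of devices that still need attention. In this incident, such a check would have been expected to flag the router.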
We apologize for the delay and inconvenience this outage caused. Thank you for your patience.
Posted May 26, 2024 - 15:30 EDT
Update
Our personnel will be returning to the on-premises site this afternoon, at which point we will investigate the situation.
We expect some delay in returning to service due to networking equipment being offline for some time.
Posted May 26, 2024 - 08:15 EDT
Update
We are continuing to work on this issue. We are uncertain at this point whether the on-site third party will make another attempt to restart equipment, as the first attempt did not succeed. We may have to wait until we are back on site on the 26th.
Posted May 25, 2024 - 15:07 EDT
Update
Unfortunately, these efforts are not proving successful at the moment. We’re working to determine whether there is an ISP problem or some other issue.
Posted May 25, 2024 - 00:09 EDT
Update
On-site personnel are actively working to restore equipment.
Posted May 24, 2024 - 23:24 EDT
Update
We’re seeing indications from the utility that power has been restored. We are waiting for on-site personnel to restore equipment.
Posted May 24, 2024 - 22:49 EDT
Update
We are still awaiting an estimate from the local utility. The ISP has quoted a restoration before 11:05 PM Eastern Time, but we are not sure whether this is based on information from the utility.
Posted May 24, 2024 - 21:10 EDT
Update
The local utility has acknowledged the outage, but has no estimate while it assesses the situation. We will pass along updates from the utility as we receive them, along with our own updates on restoration of service.
Additionally, we will be in contact with the utility regarding further measures, as the incoming power has become increasingly unreliable.
Posted May 24, 2024 - 20:16 EDT
Identified
The on-premises site is experiencing another power outage that has outlasted our UPS capacity. The site has been gracefully shut down.
Hosted services will be unavailable during this time. Our tertiary MX is down, but because the higher-priority MXs are functional, we expect no impact to Petalpost.
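For readers wondering why losing the tertiary MX does not affect mail delivery: sending servers try MX hosts in order of preference (lowest number first) and only fall back to a later one if the earlier ones are unreachable. A minimal Python sketch of that selection logic follows. The hostnames and preference values are hypothetical, not our actual DNS records.

```python
from typing import Callable, List, Optional, Tuple

# (preference, hostname); a lower preference value is tried first.
MxRecord = Tuple[int, str]


def choose_mx(records: List[MxRecord], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the most-preferred MX host that is reachable, or None if all are down."""
    for _preference, host in sorted(records):
        if is_up(host):
            return host
    return None


# Hypothetical records for illustration only.
records = [
    (30, "mx3.example.org"),  # tertiary, hosted at the affected site
    (10, "mx1.example.org"),  # primary
    (20, "mx2.example.org"),  # secondary
]

# Simulate the tertiary MX being down while the others remain up.
selected = choose_mx(records, is_up=lambda host: host != "mx3.example.org")
```

With the tertiary host down, `selected` is still the primary, so delivery proceeds as usual. The tertiary would only matter if both higher-priority hosts were unavailable at the same time.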
Posted May 24, 2024 - 20:08 EDT
This incident affected: Hosted Services (GitLab, Grafana/Prometheus Monitoring) and Hosts/PoPs (Eveland - Elkhart, IN, GitLab - Elkhart, IN, Kaneshiro - Elkhart, IN, Penrose - Elkhart, IN).