Austin wrote:
We need an update. It has been 6 hours.
Here's everything I know about the situation so far...

PROBLEM SUMMARY
===============
Two power circuits under the floor went offline, and all connected power supplies lost power. Normally a circuit going offline wouldn't introduce a problem, due to our power layout (detailed below for those who are interested). In this case, two circuits went offline and they belonged to adjoining racks, so multiple systems on the two affected racks were affected.

OVERALL IMPACT
==============
Due to the wide array of services affected, I will keep it brief for each system. All of the outages listed began at the same time (4:14PM EDT).

OpenSRS : Offline, including AWI, RWI, API, Batch, Whois, Mailer, due to Oracle Database being offline
OpenHRS : Offline, including AWI, RWI, API, Whois, Mailer, due to Oracle Database being offline
Tulips : Offline, including AWI, RWI, API, EMail Admin., due to Oracle Database being offline
EMail : Offline, including POP, IMAP, Webmail, API, due to all 3 frontends losing power
EMail Defense : Degraded (Web Portal unavailable), due to Postgres Database being offline. Also lost power to 6 of 16 frontend servers - no problems with mail delays reported
Tucows Main Website (www.tucows.com) : Heavily degraded, due to 2 out of 3 servers losing power
Managed DNS : Core DNS was functional but URL redirects (essential component) offline, due to Oracle Database being offline
Digital Certificates : Digital Certificates operational, Provisioning & management offline, due to Oracle Database being offline
Website Builder Service : Builder available, Provisioning & management offline, due to Oracle Database being offline

CURRENT STATUS
==============
The Oracle database which services OpenSRS, OpenHRS, Tulips & Managed DNS is still offline due to hardware/systemic issues when trying to start back up. We are working to restore it asap. Everything else is back online and was restored at about 7:45PM EDT.
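
For those who want to see why losing two circuits under adjoining racks defeats the redundancy described in the POWER LAYOUT section of this message, here is a rough sketch in Python. It is only an illustration: the rack names are made up, and it assumes adjoining racks are paired so that each one takes its second feed from the other's circuit, which this message does not spell out.

```python
# Hypothetical model of the dual-feed power layout. Rack names and the
# pairing of adjoining racks are assumptions for illustration only.
# Each circuit is named after the rack it sits under.

def rack_feeds(pairs):
    """Map each rack to the two circuits feeding its RPCs.

    RPC#1 uses the circuit under the rack itself; RPC#2 uses the circuit
    under the adjoining rack it is paired with.
    """
    feeds = {}
    for a, b in pairs:
        feeds[a] = {a, b}
        feeds[b] = {b, a}
    return feeds


def racks_without_power(feeds, failed_circuits):
    """Return the racks whose servers lose both power supplies."""
    failed = set(failed_circuits)
    # A rack is dark only when every circuit feeding it has failed.
    return sorted(rack for rack, circuits in feeds.items() if circuits <= failed)


feeds = rack_feeds([("rack1", "rack2"), ("rack3", "rack4")])

# One circuit down: every server still has one live supply.
single_failure = racks_without_power(feeds, ["rack1"])

# Two circuits down under a pair of adjoining racks: both racks go dark.
double_failure = racks_without_power(feeds, ["rack1", "rack2"])

# Two circuits down under racks that do not share feeds: nothing goes dark.
split_failure = racks_without_power(feeds, ["rack1", "rack3"])

print(single_failure, double_failure, split_failure)
```

Under these assumptions a single circuit failure takes nothing down, while a failure of both circuits in one pair removes power from every dual-supply server in those two racks.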

SYSTEM STATUS PAGE
==================
There are some deficiencies with the System Status page which, in effect, have resulted in incorrect reporting of the status of Blogware (showing as Offline). This was fixed earlier tonight via the database, but a scheduled process appears to have modified it to display incorrectly yet again. To be clear, Blogware has remained online for the duration of these events.

POWER LAYOUT
============
Each rack has a single power circuit under the floor. Each rack houses (2) RPCs (Power Units with 8 ports for 8 power supplies). The first RPC in each rack is connected to the power circuit underneath the rack. The second RPC is connected to a second circuit underneath an adjoining rack. Each server has dual power supplies and is connected to RPC#1 & RPC#2. This effectively results in fully redundant power to each individual server.

A further update will be provided asap regarding the Oracle database when there is something to report. Please be patient as we are nearing the end of this nightmare.

--
_________________________________________________
Joey deVilla - Tucows, Inc. - [EMAIL PROTECTED]
TC/DC (Technical Community Development Coordinator)
"Nerdy Deeds Done Dirt Cheap"
_______________________________________________
domains-gen mailing list
[email protected]
http://discuss.tucows.com/mailman/listinfo/domains-gen
