On Feb 17, 2010, at 4:29 AM, Alan Burlison wrote:

> Derek Cicero wrote:
> 
>> There is some sort of network issue that we are having trouble diagnosing. 
>> There is a failover set-up for the SCM machines, but the back-end network is 
>> having intermittent connectivity issues so it's just flapping back and 
>> forth. We'll let you know as soon as we have more information.
> 
> It appears we have a faulty network switch.  Investigation is still ongoing, 
> but at the moment the fault seems to have cleared, if only temporarily.  We 
> expect further outages, and will send out an all clear once we have the 
> problem fixed permanently,

Not quite. 

The issue was with a cross-connect cable between the switches in two of the 
cabinets containing our systems. It caused varying degrees of connectivity 
loss between the systems on one of the networks that many of the 
applications depend upon. 

The cable was replaced at roughly 3am ET/12am PT and all services should have 
returned at that time. The only service that should have been noticeably 
unavailable during the outage was the SCM systems behind hg and 
svn.opensolaris.org. The service was unable to successfully fail over to the 
backup system because, due to the network issue, neither of the SCM systems 
had access to all the parts it needs to function properly.

The website, email and mailing lists, opengrok, etc. should have remained 
largely unaffected. If anyone continues to have problems either 
pushing/committing to the repositories or sees any other problems with the 
systems, please let us know.

e.
_______________________________________________
website-discuss mailing list
[email protected]
