It says right in the article they were running Windows 2000 Advanced Server, so the systems were not affected by the Win95 hang bug. Almost certainly Windows was fine... period. The communication software puked on the same API function that the Windows 95 dev guys screwed up with: the value rolls over, and either the application detects that and shuts itself down, or it crashes because of poor exception handling.
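To make the rollover concrete, here is a minimal sketch. It assumes the API in question is a 32-bit millisecond uptime counter of the kind Win32 GetTickCount returns; the article does not name the function, so that part is a guess. The helper names are made up for illustration. It shows how a naive deadline comparison breaks across the wrap while modular subtraction keeps working.

```python
# Sketch only: simulates a 32-bit millisecond tick counter in plain Python.
# naive_expired, elapsed_ms and wrap_safe_expired are hypothetical helper
# names, not taken from any real application.

TICK_MODULUS = 2 ** 32  # a 32-bit counter wraps back to 0 after this many ms


def naive_expired(start: int, now: int, timeout_ms: int) -> bool:
    """Buggy pattern: the absolute deadline is never wrapped to 32 bits."""
    return now >= start + timeout_ms


def elapsed_ms(start: int, now: int) -> int:
    """Elapsed ticks via modular subtraction; correct across one rollover."""
    return (now - start) % TICK_MODULUS


def wrap_safe_expired(start: int, now: int, timeout_ms: int) -> bool:
    """Timeout check that compares elapsed time instead of absolute values."""
    return elapsed_ms(start, now) >= timeout_ms


# Timer started 1 second before the rollover, checked 3 seconds later.
start = TICK_MODULUS - 1000
now = (start + 3000) % TICK_MODULUS  # the counter has wrapped around

print(naive_expired(start, now, 2000))      # naive check misses the timeout
print(wrap_safe_expired(start, now, 2000))  # modular check catches it
```

An app written the first way would look fine for weeks and then misbehave at the wrap, which would explain a "reboot before 49.7 days" procedure without Windows itself ever falling over.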
If Windows crashed of its own accord, then yes, MS needs to share some blame. If what actually happened is that a crappy app died and the OS was fine the whole time, the responsibility rests with the application vendor and the design/implementation team.

Should technicians be rebooting boxes as fixes? Absolutely not. However, before assuming it is an OS issue, understand why they are rebooting it. In this case I expect it was to reset the tick count for the application itself. If it is because the app is eating up all the memory, that is one hellacious memory leak they need to work on in the app.

joe

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michal Zalewski
Sent: Friday, September 24, 2004 2:32 PM
To: ASB
Cc: [EMAIL PROTECTED]
Subject: Re: [Full-Disclosure] Windoze almost managed to 200x repeat 9/11

On Fri, 24 Sep 2004, ASB wrote:

> "The servers are timed to shut down after 49.7 days of use in order to
> prevent a data overload, a union official told the LA Times."
>
> How you managed to read "OS failure" into this is rather astounding...

The statement above, whether cleverly disguised by the authorities or mangled by the press, does ring a bell. It is not about applications eating up too much memory and hence requiring an occasional reboot, oh no. Windows 9x had a problem (fixed by Microsoft, by the way) that caused it to hang or crash after a jiffy counter in the kernel overflowed:

http://support.microsoft.com/support/kb/articles/q216/6/41.asp

It would happen precisely after 49.7 days. Coincidence? Not very likely. It seems that the system was running on unpatched Windows 95 or 98, and rather than deploying a patch, they came up with a maintenance procedure requiring a scheduled reboot every 30 days. This is one hell of a ridiculous idea, and any attempt to blame a failure on a technician who failed to reboot the box is really pushing it.
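
For reference, the 49.7-day figure is what you get if the counter is 32 bits wide and ticks once per millisecond. That width and resolution are assumptions here, since the thread does not spell them out. A quick check of the arithmetic:

```python
# Sketch only: assumes a 32-bit counter incremented once per millisecond.
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in one day
ROLLOVER_MS = 2 ** 32             # 4,294,967,296 distinct counter values

rollover_days = ROLLOVER_MS / MS_PER_DAY
print(f"{rollover_days:.2f} days")  # 49.71 days
```
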

It is not uncommon for telecommunications, medical, flight control, banking and other mission-critical applications to run on terribly ancient software (and with a clause that requires them NOT to be updated, because the software is not certified against those patches). In the end, the OS and decision-makers that implemented the system and established ill-conceived workarounds should split the blame.

/mz

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html