in infinite wisdom shiv sastry spoke thus On 12/29/2006 10:32 PM:
There was a SciAm article on the subject
Dependable Software by Design
http://www.sciam.com/article.cfm?articleID=00020D04-CFD8-146C-8D8D83414B7F0000&pageNumber=2&catID=2
If you subscribe to the Risks Lists (http://www.risks.org), every week
you get to see how systems can fail in spite of the best efforts of their
creators. Here is one such incident that I doubt any software
engineer (hehe) would be able to figure out:
"
Date: Fri, 22 Dec 2006 10:13:36 -0600
From: "Ted Lee" <[EMAIL PROTECTED]>
Subject: Re: Trig error checking (RISKS-24.51/52)
Speaking of spurious faults, which "mike martin" <[EMAIL PROTECTED]>
did in RISKS-24.52, I am reminded of an amusingly insidious fault I
ended up tracking down at the PDP-1 at the Cambridge Electron
Accelerator ca. 1968. The machine was used primarily to run experiments,
but one of the professors had the idea of also using it as a teaching
aid. The machine had been retrofitted with memory protection hardware
so several experimenters could run their software at once without
stepping on each other's toes. (As I recall, it didn't have any address
translation, just protection.) I ran a program (n-body simulator for
elementary physics classes) I'd written that had been working fine --
and it came up with a memory fault, repeatedly. I tracked the fault
down to happening in a display subroutine, in particular, a subroutine
to draw a circle. I vaguely remember simplifying everything so all I
was doing was drawing a single large circle (like a foot in diameter --
the screen was huge) -- and the machine and display were slow enough I
could see that the fault happened exactly at something like the top of
the screen. The only "interesting" thing about that is that it was at a
point where the value in the accumulator would have been all 1's and on
the next iteration overflowed to all 0's. For any of you old enough to
know what a real computer was like, the buses in this machine were
bundles of wires or flat cables with something like 18 wires in them.
It turns out that the single wire (and it really was a single wire that
just sort of hung across the electronic racks) that carried the signal
indicating a protection violation had been routed close to the
accumulator: the sudden energy of all the bits turning from 1 to 0 got
coupled into that wire and caused the fault.
"
--
raj shekhar
facts: http://rajshekhar.net | opinions: http://rajshekhar.net/blog
I dare do all that may become a man; Who dares do more is none.