On Feb 21, 2:48 pm, davide...@cs.cmu.edu (Dave Eckhardt) wrote:

> * Bits were flipping pretty often.  I think we got 10-ish events
> per day.

TLB bits are not like DRAM bits.  They were surely static cells, built
for speed and functionality (CAM) not density.  The cells would be
quite large.  It is unlikely that this problem came from external
radiation.  Guess: the problem was a marginal design of the circuitry.

At about that time DRAM cells seemed to be suffering from radiation-
induced bit flips.  It was felt that 16Kbit chips would be the limit
because of this (please realise that my own memory might be slightly
faulty).  It turned out that the radiation was actually coming from
the chip packaging material.  Once that was sorted, RAM density
marched on to where we are now.

As cells shrink, and voltages shrink, I understand that radiation can
have greater effects.  Eventually mainstream systems will have ECC.
But I've been thinking this for as long as there have been personal
computers built out of microprocessors.

Adding ECC to memory seems to me to be an easy no-brainer.  Adding it
"everywhere" in processors does not seem easy.

Actually, even adding it in memory isn't that easy.  In the old days,
a simple Hamming code was good enough because each bit in a word lived
on a different chip.  Now memory chips are wider and so the code has
to account for multi-bit errors (flipping of bits is not independent).
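
To illustrate the point, here is a minimal Python sketch of a textbook
Hamming(7,4) code.  The code choice and the function names (encode,
syndrome, decode) are mine, for illustration only; real ECC memory
typically uses wider codes.  It shows that a plain Hamming code repairs
any single flipped bit, but two flips in the same word are silently
"corrected" into the wrong data.

```python
def encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (list of bits).

    Codeword positions are numbered 1..7; positions 1, 2 and 4 hold
    parity bits, positions 3, 5, 6 and 7 hold the data bits.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0 is unused
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def syndrome(word):
    """XOR together the positions (1..7) of all set bits.

    A valid codeword gives 0; a single flipped bit gives its position.
    """
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s


def decode(word):
    """Flip the bit the syndrome points at (if any), then extract data."""
    word = list(word)
    s = syndrome(word)
    if s:
        word[s - 1] ^= 1
    return word[2] | (word[4] << 1) | (word[5] << 2) | (word[6] << 3)


def flip(word, *positions):
    """Return a copy of word with the given 1-based positions inverted."""
    word = list(word)
    for p in positions:
        word[p - 1] ^= 1
    return word
```

With these definitions, decode(flip(encode(n), p)) returns n for every
single position p, while decode(flip(encode(n), p, q)) with p != q
returns some other value and gives no sign that anything went wrong.
That is the failure mode that matters once several bits of a word sit
on the same chip.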

Cray famously said "Parity is for farmers".  It was an obscure joke
(referring to some US agricultural subsidy) but really he meant that
he didn't want to waste circuitry on error checking (as I understand
it).  This was one of the things that made me averse to his systems.

It is really hard to guess the conversion rate of bit flips into
observed anomalies on ordinary systems.  I wonder if any research has
been done on this.  In the real world, software bugs surely take most
of the blame.  Users seem to have been trained to accept lower
reliability in computer systems.

Apple seems to be one of the few vendors that might be able to market
the idea of ECC to consumers.
