At 01:15 PM 11/26/2007, Bruno Coutinho wrote:
I heard that the major source of memory corruption in servers is the
memory bus.
And this becomes worse as you add memory sticks.
With 8 memory sticks that have 8 chips on each side, you have 128 chips.
So the main purpose of ECC is correcting bus errors.

This is a real possibility. The raw error rate on the chips is quite low.
Mike Sanor, compatibility and performance manager at Crucial
Technology, a division of DRAM manufacturer Micron Technology that
sells memory directly to end users, is quoted as saying:
ECC is most useful for "servers and precision workstations, but not
commodity desktops. The reason is simple: The error rate in today's
consumer-level memory is so low that for most everyday
applications, adding ECC is pure overkill. For standard DDR2 memory,
the error rate is something like 100 soft errors over 1 billion
device hours. If there are 16 memory devices or chips on a given
module, that translates to one soft error every 30 years. Even if you
only have two such DIMMs in a system, that's still less than one
error for more than the lifetime of the system as a whole."
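
A quick check of the arithmetic, as a rough sketch only. It assumes the
quoted rate of 100 soft errors per billion device hours, errors that are
independent and scale linearly with chip count, and 8766 hours per year.
The function name is my own. It covers only chip-level soft errors, not
bus errors.

```python
# Rough estimate of mean time between soft errors for a given chip count.
# Assumption: errors are independent and the rate scales linearly with
# the number of memory devices (chips).

HOURS_PER_YEAR = 24 * 365.25  # 8766 hours


def years_between_soft_errors(devices, errors_per_billion_device_hours=100.0):
    """Return the expected number of years between soft errors."""
    errors_per_hour = devices * errors_per_billion_device_hours / 1e9
    return 1.0 / errors_per_hour / HOURS_PER_YEAR


if __name__ == "__main__":
    # 16 = one module as quoted, 32 = two such DIMMs,
    # 36 = shown for comparison, 128 = the 8-stick example above.
    for devices in (16, 32, 36, 128):
        years = years_between_soft_errors(devices)
        print(f"{devices:4d} devices: about {years:.1f} years per soft error")
```

Under these assumptions, 16 devices come out at roughly 71 years between
soft errors, not 30. The 30-year figure is closer to what about 36 devices
would give, so it may have been worked out for a different chip count.
The 128-chip configuration comes out at roughly 9 years.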