On 4/18/12 12:35 PM, Jeroen van Aart wrote:
 Laurent GUERBY wrote:
> Do you have reference to recent papers with experimental data about
> non ECC memory errors? It should be fairly easy to do
 Maybe this provides some information:

 http://en.wikipedia.org/wiki/ECC_memory#Problem_background

 "Work published between 2007 and 2009 showed widely varying error
 rates with over 7 orders of magnitude difference, ranging from
 10^-10 to 10^-17 error/bit·h, roughly one bit error, per hour, per
 gigabyte of memory to one bit error, per century, per gigabyte of
 memory.[2][4][5] A very large-scale study based on Google's very
 large number of servers was presented at the
 SIGMETRICS/Performance’09 conference.[4] The actual error rate found
 was several orders of magnitude higher than previous small-scale or
 laboratory studies, with 25,000 to 70,000 errors per billion device
 hours per megabit (about 3–10×10^-9 error/bit·h), and more than 8% of
 DIMM memory modules affected by errors per year."
Dear Jeroen,

In the work that led up to RFC 3309, many of the errors found on the Internet pertained to single interface bits rather than single data bits. I worked at a large chip manufacturer that foolishly removed internal memory error detection to save space; it cost them dearly, because they then needed far more exhaustive four-corner testing. The checksums used by TCP and UDP detect single-bit data errors, but may miss as much as 2% of single interface bit errors. It would be surprising to find memory designs lacking internal error detection logic.
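
To illustrate only the single-bit data case, here is a minimal Python sketch of the 16-bit one's complement checksum described in RFC 1071, which TCP and UDP use. The function names and the sample payload are hypothetical, chosen for illustration. It flips every bit of a buffer in turn and counts the flips the checksum fails to notice. It does not attempt to model interface bit errors.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the 16-bit one's complement sum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def undetected_single_bit_flips(data: bytes) -> int:
    """Count single-bit flips in data that leave the checksum unchanged."""
    reference = internet_checksum(data)
    missed = 0
    for bit in range(len(data) * 8):
        corrupted = bytearray(data)
        corrupted[bit // 8] ^= 1 << (bit % 8)  # flip exactly one bit
        if internet_checksum(bytes(corrupted)) == reference:
            missed += 1
    return missed


# Hypothetical payload, for illustration only.
payload = b"hypothetical payload for illustration"
print(undetected_single_bit_flips(payload))
```

A single flipped data bit changes one 16-bit word by a power of two smaller than 65535, so the one's complement sum always changes and the count printed is zero.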

Regards,
Douglas Otis

