On 04/04/10 10:00 AM, Willard Korfhage wrote:

What should I make of this? All the disks are bad? That seems
unlikely. I found another thread

http://opensolaris.org/jive/thread.jspa?messageID=399988

where it finally came down to bad memory, so I'll test that. Any
other suggestions?

It could be the CPU. I had a very bizarre case where the CPU would
sometimes miscalculate the checksums of certain files, mostly
when it was also busy doing other things. Probably the cache.

Days of running memtest and SUNWvts didn't turn up any errors
because this was a weirdly pattern-sensitive problem. However, I
too am of the opinion that you shouldn't even think of running ZFS
without ECC memory (lots of threads about that!) and that bad memory
is far, far more likely to be your problem, but I wouldn't count on
diagnostics finding it, either. Of course, it could be the controller too.
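
Since stock diagnostics may miss a pattern-sensitive fault, one crude
cross-check is to hash the same file many times and see whether the
digest ever changes. What follows is only a rough sketch under some
assumptions: Python is available on the box, SHA-256 is a stand-in and
not necessarily the checksum your pool uses, and the function names
(sha256_of, count_digests) are made up for illustration. After the
first pass the file will mostly be read from cache, so this exercises
the CPU and RAM path rather than the disks.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def count_digests(path, rounds=100):
    """Hash the same file `rounds` times; map each digest seen to its count."""
    seen = {}
    for _ in range(rounds):
        digest = sha256_of(path)
        seen[digest] = seen.get(digest, 0) + 1
    return seen


if __name__ == "__main__" and len(sys.argv) > 1:
    results = count_digests(sys.argv[1])
    for digest, count in results.items():
        print(count, digest)
    if len(results) > 1:
        print("Digest changed between runs of an unchanged file.")
```

More than one distinct digest for a file that nobody is writing to
would point at the CPU, cache or memory rather than the disks. A clean
run proves nothing, though, especially if the fault only shows up when
the machine is busy with other work.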

For laughs, the CPU calculating bad checksums was discussed in
http://opensolaris.org/jive/message.jspa?messageID=469108
(see the last message in the thread).

If you are seriously contemplating using a system with
non-ECC RAM, check out the Google research mentioned in
http://opensolaris.org/jive/thread.jspa?messageID=423770
(paper: http://www.cs.toronto.edu/%7Ebianca/papers/sigmetrics09.pdf)

Cheers -- Frank

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
