On 04/ 4/10 10:00 AM, Willard Korfhage wrote:
> What should I make of this? All the disks are bad? That seems
> unlikely. I found another thread
> http://opensolaris.org/jive/thread.jspa?messageID=399988
> where it finally came down to bad memory, so I'll test that. Any
> other suggestions?

It could be the CPU. I had a very bizarre case where the CPU would
sometimes miscalculate the checksums of certain files, mostly
when it was also busy doing other things. Probably the cache.
Days of running memtest and SUNWvts didn't turn up any errors
because this was a weirdly pattern-sensitive problem. However, I
too am of the opinion that you shouldn't even think of running ZFS
without ECC memory (lots of threads about that!) and that bad memory
is far, far more likely to be your problem, but I wouldn't count on
diagnostics finding it, either. Of course, it could be the controller too.
For laughs, the CPU calculating bad checksums was discussed in
http://opensolaris.org/jive/message.jspa?messageID=469108
(see last message in the thread).
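
A rough way to look for this kind of intermittent miscalculation is to
hash the same data many times and see whether the answer ever changes.
The sketch below is only an illustration: the function names are made up
here, and SHA-256 from Python's hashlib stands in for whatever checksum
the pool actually uses. It is not the ZFS checksum code. Re-reading a
file will usually be served from cache, so this exercises RAM and CPU
far more than the disks.

```python
import hashlib
import os


def digest_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Read the file in chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_digests(data: bytes, rounds: int = 100) -> set[str]:
    """Hash the same in-memory buffer `rounds` times and collect the results.

    On healthy hardware the returned set has exactly one element.
    """
    return {hashlib.sha256(data).hexdigest() for _ in range(rounds)}


def distinct_file_digests(path: str, rounds: int = 10) -> set[str]:
    """Re-read and re-hash one file `rounds` times and collect the results."""
    return {digest_of_file(path) for _ in range(rounds)}


if __name__ == "__main__":
    # 8 MiB of random data, hashed 200 times (both numbers are arbitrary).
    buf = os.urandom(8 << 20)
    results = distinct_digests(buf, rounds=200)
    if len(results) == 1:
        print("all digests agree")
    else:
        print(f"{len(results)} different digests for identical data: suspect hardware")
```

Since the problem I saw showed up mostly when the CPU was busy with
other things, it makes sense to run something like this alongside other
load. A clean run proves nothing, for the same reason memtest and
SUNWvts found nothing in my case.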
If you are seriously contemplating using a system with
non-ECC RAM, check out the Google research mentioned in
http://opensolaris.org/jive/thread.jspa?messageID=423770
http://www.cs.toronto.edu/%7Ebianca/papers/sigmetrics09.pdf
Cheers -- Frank