----- Original Message ----- From: "James Snow" <s...@teardrop.org>

I have a ZFS server on which I've seen periodic checksum errors on
almost every drive. While scrubbing the pool last night, it began to
report unrecoverable data errors on a single file.

I compared an md5 of the supposedly corrupted file to an md5 of the
original copy, stored on different media. They were the same, suggesting
no corruption.
...

I've had this before, and it has always turned out to be failing
hardware. It's been a mixture of faults for us:
1. Memory, even though it was ECC and reported no failures in use or
under memtest.
2. CPU / northbridge on old AMDs, not 100% sure which. This started
as ZFS checksum errors and then, weeks or months later, led to
random untraceable panics and watchdog timeouts in the bge NIC.
Disabling the cores on the second CPU fixed this for us on two separate
machines, e.g.
/boot/loader.conf
hint.lapic.2.disabled="1"
hint.lapic.3.disabled="1"

So while ZFS can report errors on files when the disks themselves are
not at fault, and the data is fine, as you confirmed, don't ignore the
reports.
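
If anyone wants to repeat the comparison James describes, here is a
minimal sketch. It assumes FreeBSD's md5(1) is available (falling back
to md5sum(1) where it is not); the function names md5_of and same_md5
are my own placeholders and are not from the original post.

```shell
#!/bin/sh
# Print the bare MD5 digest of a file.
# Assumption: md5(1) with -q exists (FreeBSD); otherwise use md5sum(1).
md5_of() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        md5sum "$1" | awk '{print $1}'
    fi
}

# Return 0 if both files have the same digest, 1 if they differ,
# and 2 if either file cannot be read.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    [ "$(md5_of "$1")" = "$(md5_of "$2")" ]
}
```

Usage would be something like
same_md5 /tank/data/file /mnt/backup/file && echo identical
where both paths are hypothetical. "zpool status -v" lists the files
ZFS currently flags, which gives you the first argument.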

   Regards
   Steve


_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
