On Sun, 21 Apr 2019 14:21:27 +0100 Joseph Graham <jos...@xylon.me.uk> wrote:

> > In fact, in many filesystems there are very weak – or no! – guarantees that
> > the data you're reading is actually correct. Systems like ext4 simply assume
> > that the data written to the disk will never change. AFAIK, it has
> > essentially no mechanism at all to deal with silent data corruption.
>
> It's not fair to say there's "no mechanism at all to deal with silent
> data corruption". The hard-disk/ssd does checksum every block. If a block
> fails a checksum the disk keeps trying until it reads a block that
> matches the checksum, else gives up with a read-error.
>
> So really it's a matter of whether you trust your drives to do their
> job correctly.

Unfortunately it's not that simple; from [1]:

> Finding (1): In addition to disk failures (20-55%), physical interconnect
> failures make up a significant part (27-68%) of storage subsystem
> failures. Protocol failures and performance failures both make up
> noticeable fractions.
>
> Implications: Disk failures are not always a dominant factor of storage
> subsystem failures, and a reliability study for storage subsystems cannot
> only focus on disk failures. Resilient mechanisms should target all failure
> types.

The annualized failure rate for this kind of silent error is about 3-4%.
That's pretty high!

Also from the ZFS authors [2]:

> - Wrote a simple application to write/verify 1GB file
>   - Write 1MB, sleep 1 second, etc. until 1GB has been written
>   - Read 1MB, verify, sleep 1 second, etc.
> - Ran on 3000 rack servers with HW RAID card
> - After 3 weeks, found 152 instances of silent data corruption
>   - Previously thought “everything was fine”
> - HW RAID only detected “noisy” data errors
> - Need end-to-end verification to catch silent data corruption

There's much more research on this; these are just the first results I found.
This is the reason that pretty much all of the newer filesystems have
checksums.

[1]: https://www.usenix.org/legacy/event/fast08/tech/full_papers/jiang/jiang.pdf
[2]: https://www.snia.org/sites/default/orig/sdc_archives/2008_presentations/monday/JeffBonwick-BillMoore_ZFS.pdf
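
For illustration, the write/verify test quoted from [2] could be sketched roughly like this in Python. This is not the program from the slides: the function names, the use of SHA-256, and the parameters are all my own assumptions. It only shows the general idea of end-to-end verification, where the application keeps its own checksum of what it wrote and compares it against what it later reads back.

```python
import hashlib
import os
import time


def write_test_file(path, chunks=16, chunk_size=1024 * 1024, pause=0.0):
    """Write `chunks` random blocks to `path`; return each block's SHA-256.

    Hypothetical helper: names and defaults are illustrative only.
    """
    digests = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            block = os.urandom(chunk_size)
            digests.append(hashlib.sha256(block).hexdigest())
            f.write(block)
            time.sleep(pause)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to flush its buffers to the device
    return digests


def verify_test_file(path, digests, chunk_size=1024 * 1024, pause=0.0):
    """Re-read `path` and return the indices of blocks whose checksum differs."""
    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(digests):
            block = f.read(chunk_size)
            if hashlib.sha256(block).hexdigest() != expected:
                bad.append(i)
            time.sleep(pause)
    return bad
```

To approximate the quoted test you would call write_test_file(path, chunks=1024, pause=1.0) and later verify_test_file(path, digests, pause=1.0). Note that a read shortly after the write is likely to be served from the page cache rather than from the disk, so a real test would need to verify much later or bypass the cache.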