On Wed, Mar 28, 2018 at 09:33:38AM -0400, Paul Koning via cctalk wrote:
[...]
> The basic assumption is that failures are "fail stop", i.e., a drive refuses
> to deliver data. (In particular, it doesn't lie -- deliver wrong data. You
> can build systems that deal with lying drives but RAID is not such a system.)
> The failure may be the whole drive ("it's a door-stop") or individual blocks
> (hard read errors).
The assumption that disks don't lie is demonstrably false, and anybody who still designs or sells a system in 2018 that makes that assumption is a charlatan. I have hardware which proves it.

Sun's ZFS filesystem adds an extra "trust but verify" layer of protection using strong checksums. I have a server with a pair of mirrored 3TB enterprise disks which are "zpool scrub"bed (surface-scanned and every block's checksum verified) weekly. Every few months the scrub hits a bad checksum, showing that the disk returned different data from what was written, even though it claimed the read was OK. At best (and most likely) the problem was a single bit flip, i.e. roughly a 1 in 1.8e13 error rate. So much for the manufacturer's claim of less than 1 in 1e15 for that model of disk.

A workstation with a pair of 512GB consumer-grade SSDs shows a half-dozen bad stripes in every scrub performed after the machine has been powered down for a week or so. The SSDs have just a few hundred hours on the clock and perhaps three full drive writes. I love the performance of SSDs, but they are appallingly unreliable for even medium-term storage.

Fortunately, ZFS can tell from the checksums which half of the mirror is lying, and rewrite the bad stripe from the known-good copy. It even handles the case where both disks have some errors, as long as they don't both fail on the same stripe. Traditional RAID just cannot self-heal like that.
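For anyone curious how the self-healing works in principle, the trick is that the checksum lives with the block pointer, not next to the data it covers, so a disk can't validate its own lie. Here's a toy sketch in Python of a checksum-verified two-way mirror -- nothing like ZFS's actual on-disk layout, and all the names are made up, but it shows why the good copy can always be identified and the bad one rewritten:

    import hashlib

    def checksum(block: bytes) -> bytes:
        # ZFS uses fletcher4 or SHA-256 per block; plain SHA-256 here.
        return hashlib.sha256(block).digest()

    class Mirror:
        """Toy two-way mirror: each logical block is stored on both
        'disks', and its checksum is kept in the block pointer (i.e.
        outside the block itself), so a disk that silently returns bad
        data can be caught."""

        def __init__(self):
            self.disks = [dict(), dict()]   # block number -> bytes
            self.ptr_checksums = dict()     # block number -> expected checksum

        def write(self, blkno: int, data: bytes):
            self.ptr_checksums[blkno] = checksum(data)
            for disk in self.disks:
                disk[blkno] = data

        def read(self, blkno: int) -> bytes:
            expected = self.ptr_checksums[blkno]
            copies = [disk[blkno] for disk in self.disks]
            for data in copies:
                if checksum(data) == expected:
                    # Self-heal: rewrite any copy whose checksum failed.
                    for j, other in enumerate(copies):
                        if checksum(other) != expected:
                            self.disks[j][blkno] = data
                    return data
            raise IOError("block %d: no copy matches its checksum" % blkno)

    # A silent corruption on disk 0 is detected and repaired from disk 1.
    m = Mirror()
    m.write(0, b"important data")
    m.disks[0][0] = b"impoRtant data"          # the disk "lies"
    assert m.read(0) == b"important data"      # caught, served from the good copy
    assert m.disks[0][0] == b"important data"  # bad copy rewritten

Plain RAID1 has no equivalent of that independently-stored checksum: when the two copies disagree it can only guess which one to believe, if it notices at all.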