On Fri, Aug 28, 2015 at 3:35 AM, Hugo Mills <h...@carfax.org.uk> wrote:
> On Fri, Aug 28, 2015 at 10:50:12AM +0200, George Duffield wrote:
>> Running a traditional raid5 array of that size is statistically
>> guaranteed to fail in the event of a rebuild.
>
>    Except that if it were, you wouldn't see anyone running RAID-5
> arrays of that size and (considerably) larger. And successfully
> replacing devices in them.
>
>    As I understand it, the calculations that lead to the conclusion
> you quote are based on the assumption that the bit error rate (BER)
> of the drive is applied on all reads -- this is not the case. The
> BER is the error rate of the platter after the device has been left
> unread (and powered off) for some long period of time. (I've seen 5
> years been quoted for that).
I think the confusion comes from the Unrecovered Read Error (URE) or
"Non-recoverable read errors per bits read" figure in the drive spec
sheet. On a WDC Red, for example, it's written as "<1 in 10^14", but
this gets (wrongly) reinterpreted as an *expected* URE once every
12.5 TB (not TiB) read, which is of course complete and utter
bullshit. Yet it gets repeated all the time. It's as if symbols have
no meaning, and < is some sort of arrow, or someone got bored and
just didn't want to type a space.

That < makes the quoted URE value an upper bound over what is
ostensibly a statistical sample of drives. We have no idea what the
minimum is, we don't even know the mean, and it's not in the
manufacturer's interest to publish either. The mean between consumer
SATA and enterprise SAS may not be all that different, while the
stated maximum is two orders of magnitude better for enterprise SAS,
so it makes sense to try to upsell us with that promise.

--
Chris Murphy
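P.S. For anyone who wants to check the arithmetic, here's a minimal
back-of-envelope sketch in Python. It deliberately misreads the
"<1 in 10^14" bound as the actual mean rate, which is exactly the
mistake described above; the 8 TB drive size is just an illustrative
assumption, not a figure from this thread.

    import math

    # Spec sheet: "<1 in 10^14" non-recoverable read errors per bit
    # read. Here we (wrongly) treat that upper bound as the mean rate.
    URE_RATE = 1e-14      # per-bit URE rate, really just a ceiling
    BITS_PER_TB = 8e12    # 1 TB = 10^12 bytes = 8 * 10^12 bits (TB, not TiB)

    # The naive "one URE every X TB read" figure:
    tb_per_ure = 1 / (URE_RATE * BITS_PER_TB)
    print("naive expectation: one URE per %.1f TB read" % tb_per_ure)  # 12.5

    # Chance of reading a drive end to end with no URE under the same
    # naive per-bit model (8 TB is an assumed size for illustration):
    bits_read = 8.0 * BITS_PER_TB
    p_clean = math.exp(-URE_RATE * bits_read)  # Poisson approx of (1-p)^n
    print("naive P(no URE over 8 TB) = %.2f" % p_clean)  # ~0.53

Real drives routinely read far more than 12.5 TB between UREs, which
is the whole point: the spec line is a ceiling the sample stayed
under, not a measured failure rate.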