https://btrfs.wiki.kernel.org/index.php/Status
Scrub + RAID56 | Unstable | will verify but not repair

This doesn't seem quite accurate. Scrub does repair the vast majority
of the time. But there's maybe a 1 in 3 or 1 in 4 chance that a bad
data strip leads to this sequence: a.) the data strip is fixed up from
parity, b.) the replacement parity is recomputed wrongly, c.) good
parity is silently overwritten with bad, d.) if parity reconstruction
is needed in the future, e.g. after a device or sector failure, it
results in EIO, a kind of data loss.
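
To make the a.) through d.) sequence concrete, here is a toy model in
Python. It is only a sketch under assumptions: a two-data-strip XOR
stripe, crc32 standing in for the Btrfs data checksum, and the wrong
parity modelled as "recomputed from the stale bad copy". That last part
is a guess made for illustration, not a description of what the kernel
actually does. All names are made up.

```python
import zlib


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length strips, as raid5 parity does."""
    return bytes(x ^ y for x, y in zip(a, b))


# Healthy stripe: two data strips, their checksums, and XOR parity.
d0, d1 = b"AAAA", b"BBBB"
csum_d0, csum_d1 = zlib.crc32(d0), zlib.crc32(d1)
good_parity = xor(d0, d1)

# One data strip goes bad on disk; the checksum catches it.
bad_d0 = b"AAXA"
assert zlib.crc32(bad_d0) != csum_d0

# a.) scrub rebuilds the data strip from parity, checksum now matches
fixed_d0 = xor(good_parity, d1)
assert zlib.crc32(fixed_d0) == csum_d0

# b.) ASSUMPTION: replacement parity is computed from the stale bad copy
wrong_parity = xor(bad_d0, d1)

# c.) good parity is overwritten with bad; nothing checksums parity,
#     so nothing complains
parity_on_disk = wrong_parity

# d.) later d1 is lost and has to be rebuilt from parity
rebuilt_d1 = xor(parity_on_disk, fixed_d0)
eio = zlib.crc32(rebuilt_d1) != csum_d1  # checksum mismatch -> EIO
print("rebuild of d1 fails with EIO:", eio)
```

In this model the data stays intact as long as no reconstruction is
needed; the damage only shows up once a second failure forces a rebuild
from the bad parity, and then it is reported as an error rather than
returned as good data.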

Bad bug. For sure.

But consider the identical scenario with md or LVM raid5, or any
conventional hardware raid5. A scrub check simply reports a mismatch.
It's unknown whether data or parity is bad, so the bad data strip is
propagated upward to user space without error. On a scrub repair, the
data strip is assumed to be good, and good parity is overwritten with
bad.
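
For comparison, the same kind of toy model for a checksum-less raid5.
Again a sketch under assumptions: two data strips, plain XOR parity,
and made-up function names. It is not md's or LVM's actual code, just
the check and repair behaviour described above.

```python
from functools import reduce


def xor_strips(*strips: bytes) -> bytes:
    """XOR any number of equal-length strips together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)


def scrub_check(data: list, parity: bytes) -> bool:
    """Return True on a mismatch. No checksums, so no way to say which
    strip (data or parity) is the bad one."""
    return xor_strips(*data) != parity


def scrub_repair(data: list) -> bytes:
    """Assume the data strips are good and recompute parity from them."""
    return xor_strips(*data)


good = [b"AAAA", b"BBBB"]
parity = xor_strips(*good)

# The first data strip goes bad on disk.
on_disk = [b"AAXA", b"BBBB"]

mismatch_before = scrub_check(on_disk, parity)
# Before repair, the good data is still recoverable from parity...
recoverable_before = xor_strips(parity, on_disk[1]) == good[0]

# ...but repair trusts the data and overwrites good parity with bad.
parity = scrub_repair(on_disk)
mismatch_after = scrub_check(on_disk, parity)
recoverable_after = xor_strips(parity, on_disk[1]) == good[0]

print(mismatch_before, recoverable_before)
print(mismatch_after, recoverable_after)
```

After the repair the stripe is self-consistent, the mismatch is gone,
and the bad strip is what reads return, with no error at any point.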

So while I agree in total that Btrfs raid56 isn't mature or tested
enough to consider it production ready, I think that's because of the
UNKNOWN causes for problems we've seen with raid56, not the parity
scrub bug. That bug is, yeah, NOT good, not least because it
substantially negates the data integrity guarantees Btrfs is purported
to make. But I think the bark is worse than the bite. It is not the
bark we'd like Btrfs to have though, for sure.


-- 
Chris Murphy
