Thanks, Qu!
I thought as much from following the mailing list and your great work
over the years!
Would it be possible to get the wiki updated to reflect the current
"real" status?
From Qu's statement and perspective, BTRFS is no different from the
other non-BTRFS software RAID56 implementations out there that are
marked as stable (ZFS being the exception, since RAID-Z avoids the
write hole). Also, there are no "multiple serious data-loss bugs".
Please do consider my proposal, as it would reduce the amount of
unwarranted paranoia in the community, provided the wiki properly
documents the current state together with the mitigation options:
backup power, perhaps RAID1 for metadata, or anything else you
believe is appropriate.
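
For example (just a sketch on my part; the device names and mount
point below are placeholders), the mitigation section could show:

    # new filesystem: raid5 for data, raid1 for metadata
    mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd

    # or convert the metadata of an existing filesystem to raid1
    btrfs balance start -mconvert=raid1 /mnt

    # and scrub after any unclean shutdown, before a disk can fail
    btrfs scrub start -B /mnt

That, together with backup power, addresses the two-failure
combination the wiki's write-hole paragraph describes.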
Thanks,
DP
On 28/1/19 11:52 am, Qu Wenruo wrote:
> On 2019/1/26 7:45 PM, DanglingPointer wrote:
>> Hi All,
>>
>> For clarity for the masses, what are the "multiple serious data-loss
>> bugs" mentioned in the btrfs wiki?
>>
>> The bullet points on this page:
>> https://btrfs.wiki.kernel.org/index.php/RAID56
>> don't enumerate the bugs, not even at a high level. If anything, the
>> closest thing to a bug, an issue, or a missing resilience use-case
>> would be the first point on that page:
"Parity may be inconsistent after a crash (the "write hole"). The
problem born when after "an unclean shutdown" a disk failure happens.
But these are *two* distinct failures. These together break the BTRFS
raid5 redundancy. If you run a scrub process after "an unclean shutdown"
(with no disk failure in between) those data which match their checksum
can still be read out while the mismatched data are lost forever."
>>
>> So, in a nutshell: "What are the multiple serious data-loss bugs?"
>
> There used to be two: scrub racing (minor), and screwing up the good
> copy when doing recovery (major).
>
> Although these two should already be fixed.
>
> So for the current upstream kernel, there should be no major problem
> apart from the write hole.
>
> Thanks,
> Qu
>>
>> If there aren't any, perhaps updating the wiki to something less
>> "dramatic" should be considered.