The problems with Btrfs raid1 are known, but I think they bear
repeating because they are really not OK.

This is the exact scenario that was described: a simple, clear-cut
drop-off of a member device, which then later clearly reappears (no
transient failure).

Both mdadm and LVM based raid1 would have re-added the missing device
and resynced it, because an internal write-intent bitmap is the
default (on arrays larger than 100G for mdadm, and always with LVM).
Only the new data would ever be propagated to user space, never the
stale copies. Both mdadm and LVM have the metadata to know which
drive has stale data in this common scenario.
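
For comparison, a rough sketch of the mdadm side. The device and
array names here are made up, but the flags are stock mdadm; the
internal write-intent bitmap is what makes the re-add cheap:

    # raid1 with an internal bitmap (mdadm adds one by default on
    # arrays over 100G; shown explicitly here):
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          --bitmap=internal /dev/sda1 /dev/sdb1

    # After /dev/sdb1 drops and later reappears, re-add it; only the
    # regions marked dirty in the bitmap while it was gone get resynced:
    mdadm /dev/md0 --re-add /dev/sdb1
    cat /proc/mdstat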

Btrfs does two, maybe three, bad things:
1. No automatic resync. This is net worse behavior than mdadm and
lvm, putting data at risk.
2. The new data written while degraded goes into single-profile
chunks; even if the user does a manual balance (resync), that data
isn't replicated. They must know to do a balance with -dconvert to
replicate it (see the sketch after this list). Again, this is net
worse behavior than mdadm out of the box, putting user data at risk.
3. Apparently with nodatacow, given a file whose two copies have
different transids, Btrfs won't always pick the copy with the higher
transid? If true, that's terrible, and again not at all what
mdadm/lvm do.
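
For illustration, a rough sketch of the manual recovery a user is
expected to know today. The mount point is made up; the "soft" filter
just skips chunks already at the target profile, and the scrub only
helps checksummed data (so, per #3, not nodatacow files):

    # Scrub repairs stale raid1 copies of checksummed data/metadata:
    btrfs scrub start -B /mnt/pool

    # Chunks written while degraded are "single" profile; convert
    # them back to raid1:
    btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt/pool

    # Verify no "single" chunks remain:
    btrfs filesystem df /mnt/pool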


Btrfs can do better because it has more information available to make
unambiguous decisions about data. But it needs to always do at least
as good a job as mdadm/lvm, and as reported, that didn't happen. So
some testing is needed, in particular for case #3 above with
nodatacow. That's a huge bug, if it's true.
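
Something along these lines would probe it with loop devices. Every
path and size here is made up; the commands are stock
coreutils/util-linux/btrfs-progs, and this is only a sketch:

    # Two loop devices as raid1 members:
    truncate -s 4G /var/tmp/d1.img /var/tmp/d2.img
    DEV1=$(losetup -f --show /var/tmp/d1.img)
    DEV2=$(losetup -f --show /var/tmp/d2.img)
    mkfs.btrfs -d raid1 -m raid1 "$DEV1" "$DEV2"
    mkdir -p /mnt/test && mount "$DEV1" /mnt/test

    # A nodatacow file with known contents:
    touch /mnt/test/file && chattr +C /mnt/test/file
    dd if=/dev/urandom of=/mnt/test/file bs=1M count=64 conv=notrunc

    # "Fail" the second device, remount degraded, and overwrite the
    # file in place so the two copies diverge:
    umount /mnt/test
    losetup -d "$DEV2"
    mount -o degraded "$DEV1" /mnt/test
    dd if=/dev/zero of=/mnt/test/file bs=1M count=64 conv=notrunc

    # Bring the stale device back and see which copy reads return:
    umount /mnt/test
    DEV2=$(losetup -f --show /var/tmp/d2.img)
    btrfs device scan
    mount "$DEV1" /mnt/test
    echo 3 > /proc/sys/vm/drop_caches
    cmp -s <(head -c 1M /mnt/test/file) <(head -c 1M /dev/zero) \
        && echo "read the new (zeroed) copy" \
        || echo "read the stale copy"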


Chris Murphy