These problems with Btrfs raid1 are known, but I think they bear repeating, because they are really not OK.
In the exact same scenario as described (a simple, clear-cut dropout of a member device, which later clearly reappears; no transient failure), both mdadm and LVM based raid1 would have re-added the missing device and resynced it, because an internal write-intent bitmap is the default (on arrays over 100G for mdadm, and always with LVM). Only the data written while the device was missing would be propagated to the re-added device. Both mdadm and LVM have the metadata to know which drive has stale data in this common scenario.

Btrfs does two, maybe three, bad things:

1. No automatic resync. This is net worse behavior than mdadm and LVM, putting data at risk.

2. The new data goes into single-profile chunks; even if the user does a manual balance (resync), their data isn't replicated. They must know to do a balance with -dconvert to replicate the new data. Again, this is net worse behavior than mdadm out of the box, putting user data at risk.

3. Apparently, with nodatacow, given a file with two copies at different transids, Btrfs won't always pick the copy with the higher transid? If true, that's terrible, and again not at all what mdadm/LVM are doing.

Btrfs can do better, because it has more information available with which to make unambiguous decisions about data. But it needs to always do at least as good a job as mdadm/LVM, and as reported, that didn't happen.

So some testing is needed, in particular of case #3 above with nodatacow. That's a huge bug, if it's true.

Chris Murphy
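For anyone hitting this today, the manual recovery after the device reappears looks roughly like the following. This is a sketch, not a tested procedure; /mnt is a placeholder mount point, and the exact flags are from my reading of btrfs-progs:

```shell
# Scrub repairs mismatched raid1 copies from the good mirror
# (this helps cow data, but see point #3 above for nodatacow).
# -B runs in the foreground so the exit status is meaningful.
btrfs scrub start -B /mnt

# Chunks written while the device was absent were allocated with
# the "single" profile; convert them back to raid1. The "soft"
# filter skips chunks that already have the target profile, so
# only the degraded-period data gets rewritten.
btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt
```

Neither step happens automatically, which is exactly the complaint above.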