On Mon, Feb 4, 2019 at 11:46 PM Chris Murphy <li...@colorremedies.com> wrote:
>
> After remounting both devices and scrubbing, it's dog slow. 14 minutes
> to scrub a 4GiB file system, complaining the whole time about
> checksums on the files not replicated. All it appears to be doing is
> replicating metadata at a snail's pace, less than 2MB/s.

OK, I see what's going on. The raid1 data chunk was not full, so the initial rw,degraded writes went there, and new writes then went to a single profile chunk. Upon unmounting, restoring the missing device, and mounting normally:

Data,single: Size:13.00GiB, Used:12.91GiB
   /dev/mapper/vg-test2   13.00GiB

Data,RAID1: Size:2.00GiB, Used:1.98GiB
   /dev/mapper/vg-test1    2.00GiB
   /dev/mapper/vg-test2    2.00GiB

Metadata,single: Size:1.00GiB, Used:0.00B
   /dev/mapper/vg-test2    1.00GiB

Metadata,RAID1: Size:1.00GiB, Used:15.91MiB
   /dev/mapper/vg-test1    1.00GiB
   /dev/mapper/vg-test2    1.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/mapper/vg-test2   32.00MiB

System,RAID1: Size:8.00MiB, Used:0.00B
   /dev/mapper/vg-test1    8.00MiB
   /dev/mapper/vg-test2    8.00MiB

So it has demoted the system chunk to single profile, and the new data chunk is also single profile. And even though it created a single profile metadata chunk, it isn't using it; instead it continues to use the not-yet-full raid1 metadata chunks, presumably until those are full, and only once new metadata chunks need to be allocated will those be single profile.

mdadm and LVM, upon assembly once all devices are present again, detect the stale device from its lower event count, know which blocks to replicate from the write-intent bitmap, and start this sync/replication right away - before the file system is even mounted.

So with Btrfs the recovery is neither automatic, nor is it obvious that you have to do a *balance* rather than a scrub in this case, which looks like it only happens when a raid1 is degraded down to a single device (I assume that with a 3 device array missing one device, raid1 chunks can still be created and this situation doesn't arise).

With a very new file system, perhaps most of the data written while mounted rw,degraded goes to single profile chunks. That permits use of the soft filter when converting, to avoid a full balance (a full sync). However, that's not certain. So the safest single option is, unfortunately, a full balance with the convert filter only. The most efficient is to use both the convert and soft filters (for data only; metadata must be hard converted), followed by a scrub - something like the commands sketched below.
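A minimal sketch of that recovery sequence, assuming the file system is mounted at /mnt and the intended profile is raid1 for both data and metadata (adjust mount point and profiles to taste):

  # safest: full balance, convert filter only, rewrites everything
  btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

  # most efficient: soft filter on data skips chunks already raid1;
  # metadata is hard converted (no soft)
  btrfs balance start -dconvert=raid1,soft -mconvert=raid1 /mnt

  # then scrub to verify the copies against checksums (-B = foreground)
  btrfs scrub start -B /mnt

Run one of the two balances, not both; the second is the cheap path when most stale data is in single chunks.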
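And for comparison, on the mdadm side the stale member and the dirty bitmap regions are inspectable even before assembly; device names here are just placeholders:

  # per-member superblocks; the stale device shows a lower Events count
  mdadm --examine /dev/sda1 /dev/sdb1 | grep -E '/dev/|Events'

  # write-intent bitmap state, including how many bits are dirty
  mdadm --examine-bitmap /dev/sda1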
*sigh* It's not obvious that the user must intervene at all, and then what they need to do is also non-obvious. mdadm and LVM are definitely better in this case, simply because they do the right thing and re-establish the expected replication automatically.

--
Chris Murphy