On Mon, Feb 4, 2019 at 11:46 PM Chris Murphy <li...@colorremedies.com> wrote:
>
> After remounting both devices and scrubbing, it's dog slow. 14 minutes
> to scrub a 4GiB file system, complaining the whole time about
> checksums on the files not replicated. All it appears to be doing is
> replicating metadata at a snail's pace, less than 2MB/s.


OK I see what's going on. The raid1 data chunk was not full, so
initially rw,degraded writes went there. New writes went to a single
chunk. Upon unmounting, restoring the missing device, and mounting
normally:

Data,single: Size:13.00GiB, Used:12.91GiB
   /dev/mapper/vg-test2      13.00GiB

Data,RAID1: Size:2.00GiB, Used:1.98GiB
   /dev/mapper/vg-test1       2.00GiB
   /dev/mapper/vg-test2       2.00GiB

Metadata,single: Size:1.00GiB, Used:0.00B
   /dev/mapper/vg-test2       1.00GiB

Metadata,RAID1: Size:1.00GiB, Used:15.91MiB
   /dev/mapper/vg-test1       1.00GiB
   /dev/mapper/vg-test2       1.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/mapper/vg-test2      32.00MiB

System,RAID1: Size:8.00MiB, Used:0.00B
   /dev/mapper/vg-test1       8.00MiB
   /dev/mapper/vg-test2       8.00MiB
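
(For reference, a per-device breakdown like the above is what
btrfs-progs prints; assuming the same tooling, it comes from something
like

   # btrfs filesystem usage <mountpoint>

run against wherever the file system is mounted.)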


So it's demoted the system chunk to single profile, and the new data
chunks are also single profile. And even though it created a single
profile metadata chunk, it's not using it; instead it continues to
fill the not-yet-full raid1 metadata chunks, presumably until they're
completely full, and only once new metadata chunks have to be
allocated will those be single profile.

mdadm and LVM, upon assembly once all devices are present again,
detect the stale device from its lower event count, know which blocks
to replicate from the write intent bitmap, and start that
sync/replication right away - before the file system is even mounted.
Btrfs, by contrast, is neither automatic, nor is it obvious that you
have to do a *balance* rather than a scrub in this case. It also looks
like this only happens in the single-device degraded case (I assume
that with a 3 device array missing one device, raid1 chunks can still
be created and so this situation doesn't happen).
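
For comparison, and assuming a plain 2-device md raid1 (device and
array names below are just placeholders), the whole md sequence is
roughly:

   # mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
   # cat /proc/mdstat
   # mdadm --examine-bitmap /dev/sdb1

The assemble kicks off the resync from the write intent bitmap on its
own, /proc/mdstat shows the recovery progress, and --examine-bitmap
shows which regions are still dirty - no file system mount required.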

With a very new file system, perhaps most of the data written while
mounted rw,degraded goes to single profile chunks. That permits use of
the soft filter when converting, to avoid a full balance (a full
sync). However, that's not guaranteed. So the safest single option is
unfortunately a full balance with the convert filter only. The most
efficient is to use both the convert and soft filters (for data only;
metadata must be hard converted), followed by a scrub.
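
Concretely, assuming raid1 is the target profile and /mnt is the
mountpoint (and noting the system chunks may additionally need
-sconvert=raid1 with -f on some versions - worth checking afterward),
the safe option looks like:

   # btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

and the more efficient variant, which skips data chunks that are
already raid1 and then verifies with a scrub:

   # btrfs balance start -dconvert=raid1,soft -mconvert=raid1 /mnt
   # btrfs scrub start -B /mnt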

*sigh* It's non-obvious that the user must intervene, and then what
they need to do is also non-obvious. mdadm and LVM are definitely
better in this case, simply because they do the right thing and
re-establish the expected replication automatically.

-- 
Chris Murphy
