If you clicked on the link to this topic: Thank you!

I have the following setup:

6x 500GB HDD drives
1x 32GB NVMe SSD (Intel Optane)

I used bcache to set up the SSD as the caching device, with all six HDDs as backing devices. After all that was in place, I formatted the six HDDs (via their bcache devices) with btrfs in RAID5. Everything has worked as expected for the last 7 months.
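
For reference, this is roughly how the stack was originally built (a sketch from memory; /dev/nvme0n1 and /dev/sd{a..f} stand in for the actual device names):

    # cache device on the Optane SSD, backing devices on the six HDDs
    sudo make-bcache -C /dev/nvme0n1
    for d in /dev/sd{a..f}; do sudo make-bcache -B "$d"; done
    # attach each backing device to the cache set (cache set UUID from /sys/fs/bcache/)
    for i in {0..5}; do
        sudo sh -c "echo '60a63f7c-2e68-4503-9f25-71b6b00e47b2' > /sys/block/bcache$i/bcache/attach"
    done
    # btrfs on top of the bcache devices: data in RAID5, metadata in RAID1
    sudo mkfs.btrfs -d raid5 -m raid1 /dev/bcache{0..5}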

I now have six spare 2TB HDDs and want to replace the old 500GB disks one by one. So I started with the first one by deleting it from the btrfs. This worked fine; I had no issues there. After that I cleanly detached the now-empty disk from bcache (still everything was fine) and removed it. Here are the commands for this:

    sudo btrfs device delete /dev/bcacheX /media/raid
    cat /sys/block/bcacheX/bcache/state
    cat /sys/block/bcacheX/bcache/dirty_data
    sudo sh -c "echo 1 > /sys/block/bcacheX/bcache/detach"
    cat /sys/block/bcacheX/bcache/state
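
For completeness: before detaching I waited until those two reads looked roughly like this, i.e. no dirty data left in the cache and a clean backing-device state:

    $ cat /sys/block/bcacheX/bcache/dirty_data
    0.0k
    $ cat /sys/block/bcacheX/bcache/state
    clean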

After that I installed one of the 2TB drives, attached it to bcache and added it to the RAID. The next step was to balance the data over to the new drive. Please see the commands:

    sudo make-bcache -B /dev/sdY
sudo sh -c "echo '60a63f7c-2e68-4503-9f25-71b6b00e47b2' > /sys/block/bcacheY/bcache/attach"
    sudo sh -c "echo writeback > /sys/block/bcacheY/bcache/cache_mode"
    sudo btrfs device add /dev/bcacheY /media/raid
    sudo btrfs fi ba start /media/raid/
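
In case somebody wonders where the UUID in the attach step comes from: it is the cache set UUID, which can be read from sysfs or from the caching device's superblock (again, /dev/nvme0n1 is a placeholder):

    # the directory name under /sys/fs/bcache is the cache set UUID to echo into "attach"
    ls /sys/fs/bcache/
    # alternatively, read it from the caching device's superblock
    sudo bcache-super-show /dev/nvme0n1 | grep cset.uuid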

The balance worked fine until ~164GB had been written to the new drive, which is about 50% of the data to be balanced. Then write errors on the new disk suddenly appeared. The RAID slowly became unusable (I was running 3 VMs off the RAID while balancing). I think it kept working for a while because the SSD was still committing the writes. At some point the balance stopped and I was only able to kill the VMs. I checked the I/O on the disks and the SSD was putting out a constant 1.2 GB/s of reads. My guess is that bcache kept delivering data to btrfs, which rejected it and requested it again, but that is just a guess. Anyway, I ended up resetting the host, physically disconnected the broken disk and put a new one in its place. I also created a bcache backing device on it and issued the following command to replace the faulty disk:

    sudo btrfs replace start -r 7 /dev/bcache5 /media/raid
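
Progress can be watched with the status subcommand (it keeps updating until the operation finishes; -1 prints the status only once):

    sudo btrfs replace status /media/raid       # updates continuously
    sudo btrfs replace status -1 /media/raid    # print once and exit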

The filesystem needs to be mounted read/write for this command to work. It is now doing its work, but very slowly, at about 3.5 MB/s. Unfortunately the syslog reports a lot of these messages:

    ...
    scrub_missing_raid56_worker: 62 callbacks suppressed
    BTRFS error (device bcache0): failed to rebuild valid logical 4929143865344 for dev (null)
    ...
    BTRFS error (device bcache0): failed to rebuild valid logical 4932249866240 for dev (null)
    scrub_missing_raid56_worker: 1 callbacks suppressed
    BTRFS error (device bcache0): failed to rebuild valid logical 4933254250496 for dev (null)
    ...

If I try to read a file from the filesystem, the command fails with a simple I/O error and the syslog shows entries similar to this:

    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2
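
The inode number from that warning can be mapped back to a file name, e.g. like this (assuming root 5 is the top-level subvolume, which is what is mounted at /media/raid):

    # resolve inode 1143 from the csum warning to its path(s)
    sudo btrfs inspect-internal inode-resolve 1143 /media/raid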

So far, so good (or bad). The replacement has taken about 6 hours for 4.3% so far. No read or write errors have been reported for the replace procedure ("btrfs replace status"). I will let it do its thing until it is finished. Before the first 2TB disk failed, 164 GB of data had been written to it according to "btrfs filesystem show". If I check the amount of data written to the new drive, the 4.3% represents about 82 GB (according to /proc/diskstats). I don't know how to interpret this, but anyway.
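
For the record, the 82 GB figure comes from /proc/diskstats; something like this one-liner does the conversion (the 7th value after the device name is sectors written, always in 512-byte units; "sdX" stands for the new 2TB disk):

    # sectors written * 512 bytes, converted to GiB
    awk '$3 == "sdX" { printf "%.1f GiB written\n", $10 * 512 / 1024^3 }' /proc/diskstats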

And now, finally, my questions: If the replace command finishes successfully, what should I do next? A scrub? A balance? Another backup? ;-) Do you see anything that I have done wrong in this procedure? Do the warnings and errors reported by btrfs mean that the data is lost? :-(

Here is some additional info (**edited**):

    $ sudo btrfs fi sh
    Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
    Total devices 7 FS bytes used 1.56TiB
    devid    0 size 1.82TiB used 164.03GiB path /dev/bcache5
    devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
    devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
    devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
    devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
    devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
    *** Some devices missing

    $ sudo btrfs dev stats /media/raid/
    [/dev/bcache5].write_io_errs    0
    [/dev/bcache5].read_io_errs     0
    [/dev/bcache5].flush_io_errs    0
    [/dev/bcache5].corruption_errs  0
    [/dev/bcache5].generation_errs  0
    [/dev/bcache4].write_io_errs    0
    [/dev/bcache4].read_io_errs     0
    [/dev/bcache4].flush_io_errs    0
    [/dev/bcache4].corruption_errs  0
    [/dev/bcache4].generation_errs  0
    [/dev/bcache3].write_io_errs    0
    [/dev/bcache3].read_io_errs     0
    [/dev/bcache3].flush_io_errs    0
    [/dev/bcache3].corruption_errs  0
    [/dev/bcache3].generation_errs  0
    [/dev/bcache1].write_io_errs    0
    [/dev/bcache1].read_io_errs     0
    [/dev/bcache1].flush_io_errs    0
    [/dev/bcache1].corruption_errs  0
    [/dev/bcache1].generation_errs  0
    [/dev/bcache0].write_io_errs    0
    [/dev/bcache0].read_io_errs     0
    [/dev/bcache0].flush_io_errs    0
    [/dev/bcache0].corruption_errs  0
    [/dev/bcache0].generation_errs  0
    [/dev/bcache2].write_io_errs    0
    [/dev/bcache2].read_io_errs     0
    [/dev/bcache2].flush_io_errs    0
    [/dev/bcache2].corruption_errs  0
    [/dev/bcache2].generation_errs  0
    [devid:7].write_io_errs    9525186
    [devid:7].read_io_errs     10136573
    [devid:7].flush_io_errs    143
    [devid:7].corruption_errs  0
    [devid:7].generation_errs  0

    $ sudo btrfs fi df /media/raid/
    Data, RAID5: total=1.56TiB, used=1.55TiB
    System, RAID1: total=64.00MiB, used=128.00KiB
    Metadata, RAID1: total=4.00GiB, used=2.48GiB
    GlobalReserve, single: total=512.00MiB, used=0.00B

    $ uname -a
    Linux hostname 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

    $ btrfs --version
    btrfs-progs v4.15.1

    $ cat /etc/lsb-release
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=18.04
    DISTRIB_CODENAME=bionic
    DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

The 'device replace' just finished. It stopped close to the 9% mark, and I think this percentage matches the amount of data written to the drive: 164 GiB out of a total size of 1.82 TiB, so 100% would correspond to replacing the full 2TB drive. Here is some additional output:

    $ btrfs replace status -1 /media/raid/
    Started on 30.Oct 08:16:53, finished on 30.Oct 21:05:22, 0 write errs, 0 uncorr. read errs

    $ sudo btrfs fi sh
    Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
    Total devices 6 FS bytes used 1.56TiB
    devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
    devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
    devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
    devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
    devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
    devid    7 size 1.82TiB used 164.03GiB path /dev/bcache5

Reading files still aborts with I/O errors, and the syslog still shows:

    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0x98f94189 expected csum 0x6340b527 mirror 1
    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2

So I think the most harmless action is a read-only scrub; I have just started the process. Errors and warnings flood the syslog:

    $ sudo btrfs scrub start -BdrR /media/raid    # -B no background, -d statistics per device, -r read-only, -R raw statistics per device
    $ tail -f /var/log/syslog
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2848, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109331456 on dev /dev/bcache5, physical 2954104832, root 5, inode 418, offset 1030803456, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2849, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590108811264 on dev /dev/bcache5, physical 2953977856, root 5, inode 1533, offset 93051236352, length 4096, links 1 (path: VMs/Virtualbox/vmrbreb/vmrbreb-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2850, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109335552 on dev /dev/bcache5, physical 2954108928, root 5, inode 418, offset 1030807552, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2851, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590108815360 on dev /dev/bcache5, physical 2953981952, root 5, inode 621, offset 11864412160, length 4096, links 1 (path: VMs/Virtualbox/Win102016_Alter-Firefox/Win102016_Alter-Firefox-disk1.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2852, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109339648 on dev /dev/bcache5, physical 2954113024, root 5, inode 418, offset 1030811648, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2853, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109343744 on dev /dev/bcache5, physical 2954117120, root 5, inode 418, offset 1030815744, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging
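
A quick way to get a list of the affected files out of those messages (just a grep over the syslog, nothing btrfs-specific):

    # collect the unique file paths mentioned in the checksum-error warnings
    grep -o 'path: [^)]*' /var/log/syslog | sort -u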

My questions still remain: What should I do next? A scrub? A balance? Did I do something completely wrong? How should I interpret the errors and warnings from the read-only scrub, and might btrfs be able to fix them?

Thank you for reading, and hopefully for your answers!
