If you clicked on the link to this topic: Thank you!

I have the following setup:

6x 500GB HDD drives
1x 32GB NVMe SSD (Intel Optane)

I used bcache to set up the SSD as the caching device, with all six HDDs as backing devices. After all that was in place, I formatted the six HDDs (via their bcache devices) with btrfs in RAID5. Everything has worked as expected for the last 7 months.
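
For reference, this is roughly how the stack was originally built (a sketch from memory; /dev/nvme0n1 and /dev/sd{a..f} stand in for the actual device names):

    # cache device on the Optane SSD, backing devices on the six HDDs
    sudo make-bcache -C /dev/nvme0n1
    for d in /dev/sd{a..f}; do sudo make-bcache -B "$d"; done
    # attach each backing device to the cache set (cache set UUID from /sys/fs/bcache/)
    for i in {0..5}; do
        sudo sh -c "echo '60a63f7c-2e68-4503-9f25-71b6b00e47b2' > /sys/block/bcache$i/bcache/attach"
    done
    # btrfs on top of the bcache devices: data in RAID5, metadata in RAID1
    sudo mkfs.btrfs -d raid5 -m raid1 /dev/bcache{0..5}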

I now have six spare 2TB HDDs and want to replace the old 500GB disks one by one. So I started with the first one by deleting it from the btrfs. This worked fine; I had no issues there. After that I cleanly detached the now-empty disk from bcache (still everything was fine) and removed it. Here are the commands for this:

    sudo btrfs device delete /dev/bcacheX /media/raid
    cat /sys/block/bcacheX/bcache/state
    cat /sys/block/bcacheX/bcache/dirty_data
    sudo sh -c "echo 1 > /sys/block/bcacheX/bcache/detach"
    cat /sys/block/bcacheX/bcache/state
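
For completeness: before detaching I waited until those two reads looked roughly like this, i.e. no dirty data left in the cache and a clean backing-device state:

    $ cat /sys/block/bcacheX/bcache/dirty_data
    0.0k
    $ cat /sys/block/bcacheX/bcache/state
    clean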

After that I installed one of the 2TB drives, attached it to bcache and added it to the RAID. The next step was to balance the data over to the new drive. Please see the commands:

    sudo make-bcache -B /dev/sdY
sudo sh -c "echo '60a63f7c-2e68-4503-9f25-71b6b00e47b2' > /sys/block/bcacheY/bcache/attach"
    sudo sh -c "echo writeback > /sys/block/bcacheY/bcache/cache_mode"
    sudo btrfs device add /dev/bcacheY /media/raid
    sudo btrfs fi ba start /media/raid/
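
In case somebody wonders where the UUID in the attach step comes from: it is the cache set UUID, which can be read from sysfs or from the caching device's superblock (again, /dev/nvme0n1 is a placeholder):

    # the directory name under /sys/fs/bcache is the cache set UUID to echo into "attach"
    ls /sys/fs/bcache/
    # alternatively, read it from the caching device's superblock
    sudo bcache-super-show /dev/nvme0n1 | grep cset.uuid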

The balance worked fine until ~164GB had been written to the new drive, which is about 50% of the data to be balanced. Then write errors on the new disk suddenly appeared. The RAID slowly became unusable (I was running 3 VMs off the RAID while balancing). I think it kept working for a while because the SSD was still committing the writes. At some point the balance stopped and I was only able to kill the VMs. I checked the I/O on the disks and the SSD was putting out a constant 1.2 GB/s of reads. My guess is that bcache kept delivering data to btrfs, which rejected it and requested it again, but that is just a guess. Anyway, I ended up resetting the host, physically disconnected the broken disk and put a new one in its place. I also created a bcache backing device on it and issued the following command to replace the faulty disk:

    sudo btrfs replace start -r 7 /dev/bcache5 /media/raid
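
Progress can be watched with the status subcommand (it keeps updating until the operation finishes; -1 prints the status only once):

    sudo btrfs replace status /media/raid       # updates continuously
    sudo btrfs replace status -1 /media/raid    # print once and exit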

The filesystem needs to be mounted read/write for this command to work. It is now doing its work, but very slowly, at about 3.5 MB/s. Unfortunately the syslog reports a lot of these messages:

    ...
    scrub_missing_raid56_worker: 62 callbacks suppressed
    BTRFS error (device bcache0): failed to rebuild valid logical 4929143865344 for dev (null)
    ...
    BTRFS error (device bcache0): failed to rebuild valid logical 4932249866240 for dev (null)
    scrub_missing_raid56_worker: 1 callbacks suppressed
    BTRFS error (device bcache0): failed to rebuild valid logical 4933254250496 for dev (null)
    ...

If I try to read a file from the filesystem, the command fails with a simple I/O error and the syslog shows entries similar to this:

    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2
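
The inode number from that warning can be mapped back to a file name, e.g. like this (assuming root 5 is the top-level subvolume, which is what is mounted at /media/raid):

    # resolve inode 1143 from the csum warning to its path(s)
    sudo btrfs inspect-internal inode-resolve 1143 /media/raid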

So far, so good (or bad). The replacement has taken about 6 hours for 4.3% so far. No read or write errors have been reported for the replace procedure ("btrfs replace status"). I will let it do its thing until it is finished. Before the first 2TB disk failed, 164 GB of data had been written to it according to "btrfs filesystem show". If I check the amount of data written to the new drive, the 4.3% represents about 82 GB (according to /proc/diskstats). I don't know how to interpret this, but anyway.
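
For the record, the 82 GB figure comes from /proc/diskstats; something like this one-liner does the conversion (the 7th value after the device name is sectors written, always in 512-byte units; "sdX" stands for the new 2TB disk):

    # sectors written * 512 bytes, converted to GiB
    awk '$3 == "sdX" { printf "%.1f GiB written\n", $10 * 512 / 1024^3 }' /proc/diskstats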

And now, finally, my questions: If the replace command finishes successfully, what should I do next? A scrub? A balance? Another backup? ;-) Do you see anything that I have done wrong in this procedure? Do the warnings and errors reported by btrfs mean that the data is lost? :-(

Here is some additional info (**edited**):

    $ sudo btrfs fi sh
    Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
    Total devices 7 FS bytes used 1.56TiB
    devid    0 size 1.82TiB used 164.03GiB path /dev/bcache5
    devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
    devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
    devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
    devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
    devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
    *** Some devices missing

    $ sudo btrfs dev stats /media/raid/
    [/dev/bcache5].write_io_errs    0
    [/dev/bcache5].read_io_errs     0
    [/dev/bcache5].flush_io_errs    0
    [/dev/bcache5].corruption_errs  0
    [/dev/bcache5].generation_errs  0
    [/dev/bcache4].write_io_errs    0
    [/dev/bcache4].read_io_errs     0
    [/dev/bcache4].flush_io_errs    0
    [/dev/bcache4].corruption_errs  0
    [/dev/bcache4].generation_errs  0
    [/dev/bcache3].write_io_errs    0
    [/dev/bcache3].read_io_errs     0
    [/dev/bcache3].flush_io_errs    0
    [/dev/bcache3].corruption_errs  0
    [/dev/bcache3].generation_errs  0
    [/dev/bcache1].write_io_errs    0
    [/dev/bcache1].read_io_errs     0
    [/dev/bcache1].flush_io_errs    0
    [/dev/bcache1].corruption_errs  0
    [/dev/bcache1].generation_errs  0
    [/dev/bcache0].write_io_errs    0
    [/dev/bcache0].read_io_errs     0
    [/dev/bcache0].flush_io_errs    0
    [/dev/bcache0].corruption_errs  0
    [/dev/bcache0].generation_errs  0
    [/dev/bcache2].write_io_errs    0
    [/dev/bcache2].read_io_errs     0
    [/dev/bcache2].flush_io_errs    0
    [/dev/bcache2].corruption_errs  0
    [/dev/bcache2].generation_errs  0
    [devid:7].write_io_errs    9525186
    [devid:7].read_io_errs     10136573
    [devid:7].flush_io_errs    143
    [devid:7].corruption_errs  0
    [devid:7].generation_errs  0

    $ sudo btrfs fi df /media/raid/
    Data, RAID5: total=1.56TiB, used=1.55TiB
    System, RAID1: total=64.00MiB, used=128.00KiB
    Metadata, RAID1: total=4.00GiB, used=2.48GiB
    GlobalReserve, single: total=512.00MiB, used=0.00B

    $ uname -a
    Linux hostname 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

    $ btrfs --version
    btrfs-progs v4.15.1

    $ cat /etc/lsb-release
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=18.04
    DISTRIB_CODENAME=bionic
    DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

The 'device replace' just finished. It stopped close to the 9% mark, and I think this percentage matches the amount of data written to the drive: 164 GiB out of a total size of 1.82 TiB, so 100% would correspond to replacing the full 2TB drive. Here is some additional output:

    $ btrfs replace status -1 /media/raid/
    Started on 30.Oct 08:16:53, finished on 30.Oct 21:05:22, 0 write errs, 0 uncorr. read errs

    $ sudo btrfs fi sh
    Label: none  uuid: 9f765025-5354-47e4-afcc-a601b2a52703
    Total devices 6 FS bytes used 1.56TiB
    devid    1 size 465.76GiB used 360.03GiB path /dev/bcache4
    devid    3 size 465.76GiB used 360.00GiB path /dev/bcache3
    devid    4 size 465.76GiB used 359.03GiB path /dev/bcache1
    devid    5 size 465.76GiB used 360.00GiB path /dev/bcache0
    devid    6 size 465.76GiB used 360.03GiB path /dev/bcache2
    devid    7 size 1.82TiB used 164.03GiB path /dev/bcache5

Reading files still aborts with I/O errors, and the syslog still shows:

    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0x98f94189 expected csum 0x6340b527 mirror 1
    BTRFS warning (device bcache0): csum failed root 5 ino 1143 off 7274496 csum 0xccccf554 expected csum 0x6340b527 mirror 2

So I think the most harmless action is a read-only scrub; I have just started the process. Errors and warnings flood the syslog:

    $ sudo btrfs scrub start -BdrR /media/raid    # -B no background, -d statistics per device, -r read-only, -R raw statistics per device
    $ tail -f /var/log/syslog
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2848, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109331456 on dev /dev/bcache5, physical 2954104832, root 5, inode 418, offset 1030803456, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2849, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590108811264 on dev /dev/bcache5, physical 2953977856, root 5, inode 1533, offset 93051236352, length 4096, links 1 (path: VMs/Virtualbox/vmrbreb/vmrbreb-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2850, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109335552 on dev /dev/bcache5, physical 2954108928, root 5, inode 418, offset 1030807552, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2851, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590108815360 on dev /dev/bcache5, physical 2953981952, root 5, inode 621, offset 11864412160, length 4096, links 1 (path: VMs/Virtualbox/Win102016_Alter-Firefox/Win102016_Alter-Firefox-disk1.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2852, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109339648 on dev /dev/bcache5, physical 2954113024, root 5, inode 418, offset 1030811648, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging/Windows 10 Imaging-fixed.vdi)
    BTRFS error (device bcache0): bdev /dev/bcache5 errs: wr 0, rd 0, flush 0, corrupt 2853, gen 0
    BTRFS warning (device bcache0): checksum error at logical 4590109343744 on dev /dev/bcache5, physical 2954117120, root 5, inode 418, offset 1030815744, length 4096, links 1 (path: VMs/Virtualbox/Windows 10 Imaging VMs/Windows 10 Imaging
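
A quick way to get a list of the affected files out of those messages (just a grep over the syslog, nothing btrfs-specific):

    # collect the unique file paths mentioned in the checksum-error warnings
    grep -o 'path: [^)]*' /var/log/syslog | sort -u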

My questions still remain: What should I do next? A scrub? A balance? Did I do something completely wrong? How should I interpret the errors and warnings from the read-only scrub, and might btrfs be able to fix them?

Thank you for reading, and hopefully for your answers!
