On 2016-06-27 13:29, Chris Murphy wrote:
On Sun, Jun 26, 2016 at 10:02 PM, Nick Austin <n...@smartaustin.com> wrote:
On Sun, Jun 26, 2016 at 8:57 PM, Nick Austin <n...@smartaustin.com> wrote:
sudo btrfs fi show /mnt/newdata
Label: '/var/data'  uuid: e4a2eb77-956e-447a-875e-4f6595a5d3ec
        Total devices 4 FS bytes used 8.07TiB
        devid    1 size 5.46TiB used 2.70TiB path /dev/sdg
        devid    2 size 5.46TiB used 2.70TiB path /dev/sdl
        devid    3 size 5.46TiB used 2.70TiB path /dev/sdm
        devid    4 size 5.46TiB used 2.70TiB path /dev/sdx

It looks like fi show has bad data:

When I start heavy IO on the filesystem (running rsync -c to verify the data),
I notice zero IO on the bad drive I told btrfs to replace, and lots of IO to the
expected replacement.

I guess some metadata is messed up somewhere?

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          25.19    0.00    7.81   28.46    0.00   38.54

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdg             437.00     75168.00      1792.00      75168       1792
sdl             443.00     76064.00      1792.00      76064       1792
sdm             438.00     75232.00      1472.00      75232       1472
sdw             443.00     75680.00      1856.00      75680       1856
sdx               0.00         0.00         0.00          0          0

There are some reported bugs with 'btrfs replace' on raid56, but I
don't know their exact nature, or when and how they manifest.
The recommended fallback is to use 'btrfs device add' and then 'btrfs
device delete', but you also have other issues going on.
One other thing to mention: if the device is failing, _always_ add '-r' to the replace command line. This tells it to avoid reading from the device being replaced. In raid1 or raid10 mode it will pull from the other mirror; in raid5/6 mode it will recompute the block from parity and compare it to the stored checksums, which in turn means that this _will_ be slower on raid5/6 than a regular replace. Link resets and other issues that cause devices to disappear become more common the more damaged a disk is, and just reading from a disk puts stress on it, so avoiding reads from a failing disk matters more the worse its condition gets.
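
For illustration, here is a minimal sketch of what that invocation might look like. It assumes (going by the iostat output above) that /dev/sdx is the failing device, /dev/sdw is the replacement, and the filesystem is mounted at /mnt/newdata; adjust all three for your system. The helper name replace_cmd is made up for this example, and it only prints the command so it can be checked before being run as root:

```shell
# Hypothetical helper: print (not run) a 'btrfs replace start -r' command.
# Arguments: <failing device> <replacement device> <mount point>
replace_cmd() {
    # -r: avoid reading from the source device when another good copy exists
    printf 'btrfs replace start -r %s %s %s\n' "$1" "$2" "$3"
}

# Device names are assumptions taken from the outputs quoted above.
replace_cmd /dev/sdx /dev/sdw /mnt/newdata
```

Once a replace is actually running, 'btrfs replace status /mnt/newdata' reports its progress.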
