On 2015-07-16 07:49, Austin S Hemmelgarn wrote:
On 2015-07-14 07:49, Austin S Hemmelgarn wrote:
So, after experiencing this same issue multiple times (on almost a
dozen different kernel versions since 4.0) and ruling out the
possibility of it being caused by my hardware (or at least, the RAM,
SATA controller and disk drives themselves), I've decided to report it
here.

The general symptom is that my raid6-profile filesystems work fine for
multiple weeks, until I either reboot or otherwise try to remount them,
at which point the kernel refuses to mount them.

I'm currently using btrfs-progs v4.1 with kernel 4.1.2, although I've
been seeing this with versions of both since 4.0.

Output of 'btrfs fi show' for the most recent fs that I had this issue
with:
    Label: 'altroot'  uuid: 86eef6b9-febe-4350-a316-4cb00c40bbc5
        Total devices 4 FS bytes used 9.70GiB
        devid    1 size 24.00GiB used 6.03GiB path /dev/mapper/vg-altroot.0
        devid    2 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.1
        devid    3 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.2
        devid    4 size 24.00GiB used 6.01GiB path /dev/mapper/vg-altroot.3

    btrfs-progs v4.1

Each of the individual LVs making up the FS is just a flat chunk of
space on a separate disk from the others.
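
(For reference, the LVs were created roughly along these lines; the
volume group name matches the paths above, but the physical partitions
are just placeholders, one per disk:)

     # pvcreate /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
     # vgcreate vg /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
     # lvcreate -L 24G -n altroot.0 vg /dev/sda2
     # lvcreate -L 24G -n altroot.1 vg /dev/sdb2
     # lvcreate -L 24G -n altroot.2 vg /dev/sdc2
     # lvcreate -L 24G -n altroot.3 vg /dev/sdd2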

The FS itself passes btrfs check just fine (no reported errors, exit
value of 0), but the kernel refuses to mount it with the message
'open_ctree failed'.
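
(For completeness, this is roughly the sequence I go through each time;
the mountpoint is just an example.  check reports nothing and exits 0,
the mount fails, and the only message in dmesg is the open_ctree one
mentioned above:)

     # btrfs check /dev/mapper/vg-altroot.0; echo $?
     # mount -t btrfs /dev/mapper/vg-altroot.0 /mnt/altroot
     # dmesg | tail -n 20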

I've run 'btrfs rescue chunk-recover' and attached the output from that.

Here's a link to an image from 'btrfs-image -c9 -w':
https://www.dropbox.com/s/pl7gs305ej65u9q/altroot.btrfs.img?dl=0
(That link will expire in 30 days; let me know if you need access to
it beyond that.)

The filesystems in question all see relatively light but consistent
usage as targets for receiving daily incremental snapshots for
on-system backups (and, because I know someone will mention it, yes, I
do have other backups of the data; these are just my online backups).
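
(The backup job itself is nothing fancy, just the usual send/receive
pattern, roughly like this; the subvolume paths are only examples:)

     # btrfs subvolume snapshot -r /data /data/.snapshots/data-20150716
     # btrfs send -p /data/.snapshots/data-20150715 \
           /data/.snapshots/data-20150716 \
           | btrfs receive /mnt/altroot/backups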

As a secondary but possibly related issue, I'm seeing similar failures
with all data/metadata profiles when using BTRFS on top of a dm-thinp
volume with zeroing mode turned off (that is, with discard not clearing
the data from the areas that were discarded).
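
(In case the exact setup matters, the thin volumes were created roughly
like this; the names and sizes are placeholders:)

     # lvcreate --type thin-pool -Zn -L 100G -n thinpool vg
     # lvcreate -V 50G --thinpool vg/thinpool -n btrfs-thin
     # mkfs.btrfs /dev/vg/btrfs-thin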

Following up further on this specific issue, I've tracked it down to
dm-thinp not clearing the discard_zeroes_data flag on the device when
you turn off zeroing mode.  I'm going to do some more digging on that
and will probably send a patch to LKML to fix it.
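
(The quick way to see this is via sysfs; the dm minor number will of
course differ on other systems.  Even after turning zeroing off, the
flag still reads 1:)

     # lvchange -Zn vg/thinpool
     # cat /sys/block/dm-3/queue/discard_zeroes_data
     1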
