On Mon, Mar 22, 2021 at 3:49 PM Chris Murphy <li...@colorremedies.com> wrote:
>
> On Mon, Mar 22, 2021 at 12:32 AM Dave T <davestechs...@gmail.com> wrote:
> >
> > On Sun, Mar 21, 2021 at 2:03 PM Chris Murphy <li...@colorremedies.com> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 11:54 PM Dave T <davestechs...@gmail.com> wrote:
> > > >
> > > > # btrfs check -r 2853787942912 /dev/mapper/xyz
> > > > Opening filesystem to check...
> > > > parent transid verify failed on 2853787942912 wanted 29436 found 29433
> > > > parent transid verify failed on 2853787942912 wanted 29436 found 29433
> > > > parent transid verify failed on 2853787942912 wanted 29436 found 29433
> > > > Ignoring transid failure
> > > > parent transid verify failed on 2853827723264 wanted 29433 found 29435
> > > > parent transid verify failed on 2853827723264 wanted 29433 found 29435
> > > > parent transid verify failed on 2853827723264 wanted 29433 found 29435
> > > > Ignoring transid failure
> > > > leaf parent key incorrect 2853827723264
> > > > ERROR: could not setup extent tree
> > > > ERROR: cannot open file system
> > >
> > > btrfs insp dump-t -t 2853827723264 /dev/
> >
> > # btrfs insp dump-t -t 2853827723264 /dev/mapper/xzy
> > btrfs-progs v5.11
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > WARNING: could not setup extent tree, skipping it
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > Couldn't setup device tree
> > ERROR: unable to open /dev/mapper/xzy
> >
> > # btrfs insp dump-t -t 2853787942912 /dev/mapper/xzy
> > btrfs-progs v5.11
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > WARNING: could not setup extent tree, skipping it
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > Couldn't setup device tree
> > ERROR: unable to open /dev/mapper/xzy
> >
> > # btrfs insp dump-t -t 2853827608576 /dev/mapper/xzy
> > btrfs-progs v5.11
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > WARNING: could not setup extent tree, skipping it
> > parent transid verify failed on 2853827608576 wanted 29436 found 29433
> > Ignoring transid failure
> > leaf parent key incorrect 2853827608576
> > Couldn't setup device tree
> > ERROR: unable to open /dev/mapper/xzy
>
> That does not look promising. I don't know whether a read-write mount
> with usebackuproot will recover, or end up with problems.
>
> Options:
>
> a. btrfs check --repair
> This probably fails on the same problem: it can't set up the extent tree.
>
> b. btrfs check --init-extent-tree
> This is a heavy hammer. It might succeed, but it takes a long time. On
> 5T it might take double-digit hours or even single-digit days. It's
> generally faster to just wipe the drive and restore from backups than
> to use --init-extent-tree (I understand this *is* your backup).
>
> c. Set up an overlay file on device-mapper, to redirect the writes from
> a read-write mount with usebackuproot. I think it's sufficient to
> just mount, optionally write some files (empty or not), and umount.
> Then do a btrfs check to see whether the current tree is healthy.
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
>
> That guide is a bit complex because it deals with many drives in an
> mdadm raid, so you can simplify it for just one drive. The gist is
> that no writes go to the drive itself; it's treated as read-only by
> device-mapper. (In fact, you can optionally add a pre-step with the
> blockdev command and --setro to make sure the entire drive is
> read-only; just make sure to make it rw again once you're done
> testing.) All the writes with this overlay go into a loop-mounted
> file, which you intentionally just throw away after testing.
>
> d. Just skip the testing and try usebackuproot with a read-write
> mount. It might make things worse, but at least it's fast to test. If
> it messes things up, you'll have to recreate this backup from scratch.
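For a single drive, option (c) could look roughly like the sketch below. This is only a sketch of the technique from the wiki page, not a tested recipe: the device name /dev/mapper/xzy, the overlay size, the mount point, and the dm chunk size are all assumptions, and every command needs root.

```shell
# Sketch: non-persistent dm snapshot overlay for one drive (assumptions noted above).
# Make the real device read-only first; undo with --setrw when finished.
blockdev --setro /dev/mapper/xzy

# Sparse file that absorbs all writes; 10G is an arbitrary choice.
truncate -s 10G /tmp/overlay.img
loop=$(losetup -f --show /tmp/overlay.img)

# snapshot target: reads fall through to the drive, writes land in the loop file.
# Table format: <start> <length> snapshot <origin> <cow-dev> <persistent?> <chunksize>
size=$(blockdev --getsz /dev/mapper/xzy)   # size in 512-byte sectors
dmsetup create xzy-overlay --table "0 $size snapshot /dev/mapper/xzy $loop N 8"

# Test against the overlay, never the real drive:
mount -o usebackuproot /dev/mapper/xzy-overlay /mnt
# ... write a file or two, then umount and check the overlay ...
umount /mnt
btrfs check /dev/mapper/xzy-overlay

# Throw the overlay away afterwards and restore the drive to read-write.
dmsetup remove xzy-overlay
losetup -d "$loop"
blockdev --setrw /dev/mapper/xzy
```

If the check on the overlay comes back clean, repeating the usebackuproot mount directly on the drive should be reasonably safe; nothing written during the test ever touched the disk.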
I took this approach. My command was simply:

mount -o usebackuproot /dev/mapper/xzy /backup

It appears to have succeeded because it mounted without errors. I
completed a new incremental backup (with btrbk) and it finished
without errors. I'll be pleased if my backup history is preserved, as
appears to be the case. I will run some checks on those backup
subvolumes tomorrow. Are there specific checks you would recommend?

> As for how to prevent this? I'm not sure. About the best we can do is
> disable the drive write cache with a udev rule,

That sounds like a suitable solution for me. Thank you for this
information.

BTW, I have been using BTRFS for many years. This is the first serious
issue I have had, and as you said there is a large element of user
error and bad luck involved in this case.

> and/or raid1 with
> another make/model drive, and let Btrfs detect occasional corruption
> and self heal from the good copy. Another obvious way to avoid the
> problem is, stop having power failures, crashes, and accidental USB
> cable disconnections :)
>
> It's not any one thing that's the problem. It's a sequence of problems
> happening in just the right (or wrong) order that causes the problem.
> Bugs + mistake + bad luck = problem.
>
> --
> Chris Murphy
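The udev rule Chris mentions for disabling the drive's write cache could be sketched like this. The file path, the serial-number match, and the hdparm location are assumptions; the serial string must be replaced with the real one from `udevadm info` for the drive in question, and this only helps for drives that honor hdparm -W.

```
# /etc/udev/rules.d/99-disable-write-cache.rules  (path and match are assumptions)
# Disable the volatile write cache whenever this particular disk appears.
ACTION=="add|change", KERNEL=="sd[a-z]", ENV{ID_SERIAL}=="My_Disk_12345", \
  RUN+="/usr/sbin/hdparm -W 0 /dev/%k"
```

Matching on ENV{ID_SERIAL} rather than a bare kernel name means the rule follows the specific drive across USB reconnections instead of applying to whatever happens to enumerate as a given sdX.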