To fill in for the spectators on the list :)
Su gave me a modified version of btrfsck lowmem that was able to clean
most of my filesystem.
It's not a general case solution since it had some hardcoding specific
to my filesystem problems, but still a great success.
The email is quoted below, along with my responses to Qu.

On Tue, Jul 10, 2018 at 09:09:33AM +0800, Qu Wenruo wrote:
> 
> 
> On 2018年07月10日 01:48, Marc MERLIN wrote:
> > Success!
> > Well done Su, this is a huge improvement to the lowmem code. It went from 
> > days to less than 3 hours.
> 
> Awesome work!
> 
> > I'll paste the logs below.
> > 
> > Questions:
> > 1) I assume I first need to delete a lot of snapshots. What is the limit in 
> > your opinion?
> > 100? 150? other?
> 
> My personal recommendation is just 20. Not 150, not even 100.
 
I see. Then, I may be forced to recreate multiple filesystems anyway.
I have about 25 btrfs send/receive relationships and I have around 10
historical snapshots for each.

In the future, can't we segment extents/snapshots per subvolume, making
subvolumes mini filesystems within the bigger filesystem?

> But snapshot deletion will take time (and it's delayed, so you won't know
> if something went wrong just after "btrfs subv delete") and it even
> requires a healthy extent tree.
> If all the extent tree errors are just false alerts, that should not be a
> big problem at all.
> 
> > 
> > 2) my filesystem is somewhat misbalanced. Which balance options do you 
> > think are safe to use?
> 
> I would recommend manually checking the extent tree for BLOCK_GROUP_ITEM,
> which tells you how big a block group is and how much space is used,
> and gives you an idea of which block groups can be relocated.
> Then use vrange= to specify the exact block group to relocate.
> 
> One example would be:
> 
> # btrfs ins dump-tree -t extent <dev> | grep -A1 BLOCK_GROUP_ITEM |\
>   tee block_group_dump
> 
> Then the output contains:
>       item 1 key (13631488 BLOCK_GROUP_ITEM 8388608) itemoff 16206 itemsize 24
>               block group used 262144 chunk_objectid 256 flags DATA
> 
> The "13631488" is the bytenr of the block group.
> The "8388608" is the length of the block group.
> The "262144" is the used bytes of the block group.
> 
> The less space a block group uses, the higher its relocation priority
> (and the faster it is to relocate).
> You could write a small script to do it, or there should be some tool to
> do the calculation for you.
 
I usually use something simpler:
Label: 'btrfs_boot'  uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b
        Total devices 1 FS bytes used 30.19GiB
        devid    1 size 79.93GiB used 78.01GiB path /dev/mapper/cryptroot

This is bad: I have 30GB of data, but 78GB of the 80GB device is already
allocated. That calls for a balance, correct?
If so, I always struggle with what values I should give to dusage and
musage...
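
For illustration, here is a rough Python sketch of the arithmetic behind
that reading of the "btrfs filesystem show" output above. It assumes a
single-device filesystem and the output layout shown; the function and
key names are made up for this example, and it does not suggest any
particular dusage/musage value.

```python
import re

# Binary unit suffixes as they appear in the output quoted above.
UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
SIZE = r"([\d.]+)(KiB|MiB|GiB|TiB)"


def to_bytes(number, unit):
    """Convert a value such as ('30.19', 'GiB') to bytes."""
    return float(number) * UNITS[unit]


def allocation_summary(text):
    """Summarise how much of the device is allocated versus really used.

    Hypothetical helper: only meant for a single-device filesystem.
    """
    fs_used = re.search(r"FS bytes used\s+" + SIZE, text)
    devs = re.findall(r"devid\s+\d+\s+size\s+" + SIZE + r"\s+used\s+" + SIZE, text)
    used = to_bytes(*fs_used.groups())
    size = sum(to_bytes(n, u) for n, u, _, _ in devs)
    allocated = sum(to_bytes(n, u) for _, _, n, u in devs)
    return {
        "allocated_fraction": allocated / size,    # share of the device in chunks
        "used_of_allocated": used / allocated,     # how full those chunks are
        "slack_gib": (allocated - used) / 2**30,   # allocated but not used
    }


SAMPLE = """\
Label: 'btrfs_boot'  uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b
        Total devices 1 FS bytes used 30.19GiB
        devid    1 size 79.93GiB used 78.01GiB path /dev/mapper/cryptroot
"""

summary = allocation_summary(SAMPLE)
print(f"allocated: {summary['allocated_fraction']:.1%} of the device")
print(f"in use:    {summary['used_of_allocated']:.1%} of allocated space")
print(f"slack:     {summary['slack_gib']:.2f} GiB")
```

A high allocated fraction combined with a low in-use fraction is the
pattern described above: nearly all of the device is handed out to
chunks that are mostly empty.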

> And only relocate one block group at a time, to avoid possible problems.
> 
> Last but not least, it's highly recommended to do the relocation
> only after unused snapshots are completely deleted.
> (Otherwise it would be super super slow to relocate)

Thank you for the advice. Hopefully this helps someone else too, and
maybe someone can write a relocation helper tool if I don't have the
time to do it myself.
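
As a starting point, here is a minimal Python sketch of the small script
Qu describes. It parses the block_group_dump text produced by the
dump-tree | grep pipeline quoted above, computes used/length for each
block group, and lists the emptiest ones first. The function names are
made up for this example, the second sample item is invented, and the
one-byte vrange (bytenr..bytenr+1) assumes that vrange selects every
block group overlapping the given range; check that against your
btrfs-progs before using it.

```python
import re
from typing import NamedTuple

# Patterns for the two-line records shown in the quoted dump-tree output.
KEY_RE = re.compile(r"key \((\d+) BLOCK_GROUP_ITEM (\d+)\)")
USED_RE = re.compile(r"block group used (\d+) chunk_objectid \d+ flags (\S+)")


class BlockGroup(NamedTuple):
    bytenr: int   # start of the block group (logical address)
    length: int   # size of the block group in bytes
    used: int     # bytes used inside the block group
    flags: str    # e.g. DATA, METADATA|DUP

    @property
    def usage(self) -> float:
        return self.used / self.length if self.length else 0.0


def parse_block_groups(text):
    """Pair each BLOCK_GROUP_ITEM key line with the 'used' line after it."""
    groups, pending = [], None
    for line in text.splitlines():
        key = KEY_RE.search(line)
        if key:
            pending = (int(key.group(1)), int(key.group(2)))
            continue
        used = USED_RE.search(line)
        if used and pending:
            groups.append(BlockGroup(*pending, int(used.group(1)), used.group(2)))
            pending = None
    return groups


def relocation_candidates(groups):
    """Least-used block groups first: cheapest and most useful to relocate."""
    return sorted(groups, key=lambda g: g.usage)


def vrange_filter(group):
    """Assumed one-byte range meant to overlap only this block group."""
    return f"vrange={group.bytenr}..{group.bytenr + 1}"


# First item is from the quoted mail; the second is invented for illustration.
SAMPLE = """\
      item 1 key (13631488 BLOCK_GROUP_ITEM 8388608) itemoff 16206 itemsize 24
              block group used 262144 chunk_objectid 256 flags DATA
--
      item 9 key (22020096 BLOCK_GROUP_ITEM 1073741824) itemoff 15900 itemsize 24
              block group used 805306368 chunk_objectid 256 flags METADATA|DUP
"""

for g in relocation_candidates(parse_block_groups(SAMPLE)):
    print(f"{g.bytenr:>14} {g.usage:6.1%} {g.flags:<14} {vrange_filter(g)}")
```

Following Qu's advice, the idea would be to take one line of that output
at a time and pass its vrange to a single balance run, rather than
relocating many block groups in one go.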

> > 3) Should I start a scrub now (takes about 1 day) or anything else to
> > check that the filesystem is hopefully not damaged anymore?
> 
> I would normally recommend to use btrfs check, but neither mode really
> works here.
> And scrub only checks csum, doesn't check the internal cross reference
> (like content of extent tree).
> 
> Maybe Su could skip the whole extent tree check and let lowmem check
> the fs tree only; with --check-data-csum it should do a better job than
> scrub.

I will wait to hear back from Su, but I think the current situation is
that I still have some problems on my FS, they are just
1) not important enough to block mount rw (now it works again)
2) currently ignored by the modified btrfsck I have, but would cause
problems if I used real btrfsck.

Correct?

> > 
> > 4) should btrfs check reset the corrupt counter?
> > bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> > for now, should I reset it manually?
> 
> It could be pretty easy to implement if not already implemented.

Seems like it's not implemented, given that Su's btrfsck --repair ran to
completion and I still have corrupt set to '2' :)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08