>> Mind if I craft a fix with your Signed-off-by?

Sure!

> The problem is more complex than I thought, but still we at least have
> some workaround.
> 
> Firstly, this happens when an old fs has v2 space cache enabled but
> still has v1 space cache left over.
> 
> A newer v2 mount should clean up v1 properly, but older kernels don't do
> the proper cleanup, and thus leave some v1 cache behind.
> 
> Then we call btrfs balance on such an old fs, leading to the -ENOENT error.
> We can't ignore the error, as we have no way to relocate such leftover
> v1 cache (normally we delete it completely, but with v2 cache, we can't).
> 
> So all I can do is add a warning message about the problem.
> 
> To solve your problem, I also submitted a patch to btrfs-progs, to force
> v1 space cache cleaning even if the fs has v2 space cache enabled.
> 
> Or, you can disable v2 space cache first, using "btrfs check
> --clear-space-cache v2", then run "btrfs check --clear-space-cache
> v1", and finally mount the fs with "space_cache=v2" again.
> 
> To verify there is no space cache v1 left, you can run the following
> command:
> 
> # btrfs ins dump-tree -t root <device> | grep EXTENT_DATA
> 
> It should output nothing.
> 
> Then please try if you can balance all your data.
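
For repeated checks, the verification step above could be wrapped in a small
helper. This is only a sketch: the function name count_v1_leftovers is made
up for illustration, and it assumes that any EXTENT_DATA item in the root
tree dump is a v1 leftover, as described above.

```shell
#!/bin/sh
# Hypothetical helper: count EXTENT_DATA items in a root tree dump
# read from stdin. A count of 0 means no v1 space cache is left.
count_v1_leftovers() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the exit status clean.
    grep -c 'EXTENT_DATA' || true
}

# Intended usage (<device> is a placeholder, as in the command above):
#   btrfs ins dump-tree -t root <device> | count_v1_leftovers
```

A result of 0 corresponds to the "should output nothing" case.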

Your analysis is correct: I do have v1 leftovers, as I noted in my reply to
the [PATCH] you sent.

Now, fixing the FS:
# btrfs check --clear-space-cache v2 /dev/mapper/luks-tank-mdata
Opening filesystem to check...
Checking filesystem on /dev/mapper/luks-tank-mdata
UUID: 428b20da-dcb1-403e-b407-ba984fd07ebd
Clear free space cache v2
Segmentation fault

Wow, okay. That's unexpected.

# btrfs --version
btrfs-progs v5.9 

(gdb) r
Starting program: /usr/local/bin/btrfs check --clear-space-cache v2 /dev/mapper/luks-tank-mdata
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Opening filesystem to check...
Checking filesystem on /dev/mapper/luks-tank-mdata
UUID: 428b20da-dcb1-403e-b407-ba984fd07ebd
Clear free space cache v2

Program received signal SIGSEGV, Segmentation fault.
balance_level (level=<optimized out>, path=0x555555649490, root=0x555555645da0, trans=<optimized out>) at kernel-shared/ctree.c:930
930                             root_sub_used(root, right->len);
(gdb) bt
#0  balance_level (level=<optimized out>, path=0x555555649490, root=0x555555645da0, trans=<optimized out>) at kernel-shared/ctree.c:930
#1  btrfs_search_slot (trans=trans@entry=0x55555e8b4d30, root=root@entry=0x555555645da0, key=key@entry=0x7fffffffe000, p=p@entry=0x555555649490, ins_len=ins_len@entry=-1, cow=cow@entry=1) at kernel-shared/ctree.c:1320
#2  0x00005555555e3da7 in clear_free_space_tree (root=0x555555645da0, trans=0x55555e8b4d30) at kernel-shared/free-space-tree.c:1161
#3  btrfs_clear_free_space_tree (fs_info=<optimized out>) at kernel-shared/free-space-tree.c:1201
#4  0x000055555558cd5f in do_clear_free_space_cache (clear_version=clear_version@entry=2) at check/main.c:9872
#5  0x000055555559acce in cmd_check (cmd=0x555555638900 <cmd_struct_check>, argc=<optimized out>, argv=0x7fffffffe490) at check/main.c:10194
#6  0x000055555556ae88 in cmd_execute (argv=0x7fffffffe490, argc=4, cmd=0x555555638900 <cmd_struct_check>) at cmds/commands.h:125
#7  main (argc=4, argv=0x7fffffffe490) at btrfs.c:402
(gdb) 

Can v1 leftovers provoke this?

The patch you've sent for btrfs-progs might fix my problem, as I wouldn't need
to remove space_cache v2 first, so I may not hit this bug. But if you're
interested in looking into this one too, we might kill two birds with one stone!

I'm leaving my FS as is while waiting for your reply.

Regards,

Stéphane.
