On Tue, Dec 29, 2020 at 09:29:34PM +0800, Qu Wenruo wrote:
> [BUG]
> There are several bug reports about recent kernels being unable to
> relocate certain data block groups.
> 
> Sometimes the error just goes away, but there is one reporter who can
> reproduce it reliably.
> 
> The dmesg would look like:
> [  438.260483] BTRFS info (device dm-10): balance: start -dvrange=34625344765952..34625344765953
> [  438.269018] BTRFS info (device dm-10): relocating block group 34625344765952 flags data|raid1
> [  450.439609] BTRFS info (device dm-10): found 167 extents, stage: move data extents
> [  463.501781] BTRFS info (device dm-10): balance: ended with status: -2
> 
> [CAUSE]
> The -ENOENT error is returned from the following call chain:
> 
> add_data_references()
> |- delete_v1_space_cache();
>    |- if (!found)
>          return -ENOENT;
> 
> The variable @found is set to true if we find a data extent whose
> disk bytenr matches parameter @data_bytenr.
> 
> With extra debug, the offending tree block looks like this:
>   leaf bytenr = 42676709441536, data_bytenr = 34626327621632
> 
>                 ctime 1567904822.739884119 (2019-09-08 03:07:02)
>                 mtime 0.0 (1970-01-01 01:00:00)
>                 otime 0.0 (1970-01-01 01:00:00)
>         item 27 key (51933 EXTENT_DATA 0) itemoff 9854 itemsize 53
>                 generation 1517381 type 2 (prealloc)
>                 prealloc data disk byte 34626327621632 nr 262144 <<<
>                 prealloc data offset 0 nr 262144
>         item 28 key (52262 ROOT_ITEM 0) itemoff 9415 itemsize 439
>                 generation 2618893 root_dirid 256 bytenr 42677048360960 level 3 refs 1
>                 lastsnap 2618893 byte_limit 0 bytes_used 5557338112 flags 0x0(none)
>                 uuid d0d4361f-d231-6d40-8901-fe506e4b2b53
> 
> Although item 27 has disk bytenr 34626327621632, which matches the
> data_bytenr, its type is prealloc, not reg.
> This makes the existing code skip that item, and return -ENOENT.
> 
> [FIX]
> The code was modified in commit 19b546d7a1b2 ("btrfs: relocation: Use
> btrfs_find_all_leafs to locate data extent parent tree leaves"). Before
> that commit, we used something like
> "if (type == BTRFS_FILE_EXTENT_INLINE) continue;".
> 
> But in that offending commit, we use (type == BTRFS_FILE_EXTENT_REG),
> ignoring BTRFS_FILE_EXTENT_PREALLOC.
> 
> Fix it by also checking BTRFS_FILE_EXTENT_PREALLOC.
> 
> Reported-by: Stéphane Lesimple <stephane_btr...@lesimple.fr>
> Fixes: 19b546d7a1b2 ("btrfs: relocation: Use btrfs_find_all_leafs to locate data extent parent tree leaves")
> Signed-off-by: Qu Wenruo <w...@suse.com>

Thank you all for tracking down the bug. Added to misc-next.
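
For readers following the thread, the type check described in the quoted
[CAUSE] and [FIX] sections can be modelled in user space roughly as below.
This is only a sketch under stated assumptions, not the kernel code: the
struct, the enum and both function names are hypothetical, and the leaf is
flattened into a plain array of (type, disk bytenr) pairs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the file extent types named in the patch. */
enum demo_extent_type {
	DEMO_FILE_EXTENT_INLINE = 0,
	DEMO_FILE_EXTENT_REG = 1,
	DEMO_FILE_EXTENT_PREALLOC = 2,
};

/* Hypothetical, flattened view of one EXTENT_DATA item in a leaf. */
struct demo_file_extent {
	enum demo_extent_type type;
	uint64_t disk_bytenr;
};

/* Check as in the offending commit: only REG extents can match. */
int demo_find_extent_reg_only(const struct demo_file_extent *items,
			      size_t nr, uint64_t data_bytenr)
{
	assert(items != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (items[i].type == DEMO_FILE_EXTENT_REG &&
		    items[i].disk_bytenr == data_bytenr)
			return 0;
	}
	/* Nothing matched, which is the -ENOENT seen in the balance log. */
	return -ENOENT;
}

/* Check as described in [FIX]: REG and PREALLOC extents can both match. */
int demo_find_extent_fixed(const struct demo_file_extent *items,
			   size_t nr, uint64_t data_bytenr)
{
	assert(items != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if ((items[i].type == DEMO_FILE_EXTENT_REG ||
		     items[i].type == DEMO_FILE_EXTENT_PREALLOC) &&
		    items[i].disk_bytenr == data_bytenr)
			return 0;
	}
	return -ENOENT;
}
```

Fed a single item shaped like item 27 from the dump (type prealloc, disk
bytenr 34626327621632), the REG-only variant returns -ENOENT while the
variant that also accepts PREALLOC returns 0.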
