On 2020/12/29 3:58 AM, Stéphane Lesimple wrote:
I know it fails in relocate_block_group(), which returns -2; I'm currently
adding a couple of printk()s here and there to try to pinpoint that better.
Okay, so btrfs_relocate_block_group() starts with stage MOVE_DATA_EXTENTS, which
completes successfully, as relocate_block_group() returns 0:
BTRFS info (device <unknown>): relocate_block_group: prepare_to_relocate = 0
BTRFS info (device <unknown>): relocate_block_group loop: progress = 1, btrfs_start_transaction = ok
[...]
BTRFS info (device <unknown>): relocate_block_group loop: progress = 168, btrfs_start_transaction = ok
BTRFS info (device <unknown>): relocate_block_group: returning err = 0
BTRFS info (device dm-10): stage = move data extents, relocate_block_group = 0
BTRFS info (device dm-10): found 167 extents, stage: move data extents
Then it proceeds to the UPDATE_DATA_PTRS stage and calls relocate_block_group()
again. This time it fails at the 92nd iteration of the loop:
BTRFS info (device <unknown>): relocate_block_group loop: progress = 92, btrfs_start_transaction = ok
BTRFS info (device <unknown>): relocate_block_group loop: extents_found = 92, item_size(53) >= sizeof(*ei)(24), flags = 1, ret = 0
BTRFS info (device <unknown>): add_data_references: btrfs_find_all_leafs = 0
BTRFS info (device <unknown>): add_data_references loop: read_tree_block ok
BTRFS info (device <unknown>): add_data_references loop: delete_v1_space_cache = -2
Damn it, if we find no v1 space cache for the block group, it means
we're fine to continue...
BTRFS info (device <unknown>): relocate_block_group loop: add_data_references = -2
Then the -ENOENT goes all the way up the call stack and aborts the balance.
So it fails in delete_v1_space_cache(), though it is worth noting that the
FS we're talking about is actually using space_cache v2.
Space cache v2, no wonder no v1 space cache.
Does it help? Shall I dig deeper?
You're already at the point!
Do you mind if I craft a fix with your Signed-off-by?
Thanks,
Qu
Regards,
Stéphane.