> I know it fails in relocate_block_group(), which returns -2, I'm currently
> adding a couple printk's here and there to try to pinpoint that better.

Okay, so btrfs_relocate_block_group() starts with stage MOVE_DATA_EXTENTS,
which completes successfully, as relocate_block_group() returns 0:

BTRFS info (device <unknown>): relocate_block_group: prepare_to_realocate = 0
BTRFS info (device <unknown>): relocate_block_group loop: progress = 1, btrfs_start_transaction = ok
[...]
BTRFS info (device <unknown>): relocate_block_group loop: progress = 168, btrfs_start_transaction = ok
BTRFS info (device <unknown>): relocate_block_group: returning err = 0
BTRFS info (device dm-10): stage = move data extents, relocate_block_group = 0
BTRFS info (device dm-10): found 167 extents, stage: move data extents

Then it proceeds to the UPDATE_DATA_PTRS stage and calls
relocate_block_group() again. This time it fails at the 92nd iteration of
the loop:

BTRFS info (device <unknown>): relocate_block_group loop: progress = 92, btrfs_start_transaction = ok
BTRFS info (device <unknown>): relocate_block_group loop: extents_found = 92, item_size(53) >= sizeof(*ei)(24), flags = 1, ret = 0
BTRFS info (device <unknown>): add_data_references: btrfs_find_all_leafs = 0
BTRFS info (device <unknown>): add_data_references loop: read_tree_block ok
BTRFS info (device <unknown>): add_data_references loop: delete_v1_space_cache = -2
BTRFS info (device <unknown>): relocate_block_group loop: add_data_references = -2

Then the -ENOENT (-2) goes all the way up the call stack and aborts the
balance.

So it fails in delete_v1_space_cache(), though it is worth noting that the
FS we're talking about is actually using space_cache v2.

Does it help? Shall I dig deeper?

Regards,

Stéphane.