Re: [PATCH] Btrfs: fix race when cleaning unused block groups

Josef Bacik Wed, 05 Nov 2014 12:35:44 -0800

On 11/05/2014 02:56 PM, Filipe Manana wrote:

We have a race while deleting unused block groups that causes extents written
by past generations/transactions to be rewritten by the current transaction
before that transaction is committed. The steps that lead to this issue:


1) At transaction N one or more block groups became unused and we added them
    to the list fs_info->unused_bgs;

2) While still at transaction N we write btree extents to block group X and the
    transaction is committed;

3) The cleaner kthread is awaken and calls btrfs_delete_unused_bgs() to go 
through
    the list fs_info->unused_bgs and remove unused block groups;

4) Transaction N + 1 starts;

5) At transaction N + 1, block group X becomes unused and is added to the list
    fs_info->unused_bgs - this implies delayed refs were run, so we had the
    following function calls: btrfs_run_delayed_refs() -> __btrfs_free_extent()
    -> update_block_group(). The update_block_group() function grabs the lock
    fs_info->unused_bgs_lock, adds block group X to fs_info->unused_bgs and
    releases that lock;

6) The cleaner kthread, while at btrfs_delete_unused_bgs(), sees block group X
    added by transaction N + 1 because it's doing a loop that finishes only when
    the list fs_info->unused_bgs is empty and locks and unlocks the spinlock
    fs_info->unused_bgs_lock on each iteration. So it deletes the block group
    and its respective chunk is released. Even if it didn't do the lock/unlock
    per iteration, it could still see block group X in the list, because the
    cleaner kthread might call btrfs_delete_unused_bgs() multiple times (for
    example if there are several snapshots to delete);

7) A new block group X' is created for data, and it's associated to the same 
chunk
    that block group X was associated to;

Actually this can't happen, we search the commit root for a free devextent, so if block group X` get's mapped to a dev extent that wasdeleted in the same transaction as it was free'd in then that is adifferent problem.

8) Extents from block group X' are allocated for file data and for example an 
fsync
    makes the file data be effectively written to disk;

Also if a new block group is allocated fsync() will trigger a fulltransaction commit. So thinking about this more I'm not entirely surethere is actually a problem here. Did you observe this issue? Are yousure it's because of this change and not just exacerbated by it? Thanks,


Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Btrfs: fix race when cleaning unused block groups

Reply via email to