On 11/05/2014 04:03 PM, Filipe David Manana wrote:
On Wed, Nov 5, 2014 at 8:33 PM, Josef Bacik <jba...@fb.com> wrote:
On 11/05/2014 02:56 PM, Filipe Manana wrote:

We have a race while deleting unused block groups that causes extents
written
by past generations/transactions to be rewritten by the current
transaction
before that transaction is committed. The steps that lead to this issue:

1) At transaction N one or more block groups became unused and we added
them
     to the list fs_info->unused_bgs;

2) While still at transaction N we write btree extents to block group X
and the
     transaction is committed;

3) The cleaner kthread is awaken and calls btrfs_delete_unused_bgs() to go
through
     the list fs_info->unused_bgs and remove unused block groups;

4) Transaction N + 1 starts;

5) At transaction N + 1, block group X becomes unused and is added to the
list
     fs_info->unused_bgs - this implies delayed refs were run, so we had
the
     following function calls: btrfs_run_delayed_refs() ->
__btrfs_free_extent()
     -> update_block_group(). The update_block_group() function grabs the
lock
     fs_info->unused_bgs_lock, adds block group X to fs_info->unused_bgs
and
     releases that lock;

6) The cleaner kthread, while at btrfs_delete_unused_bgs(), sees block
group X
     added by transaction N + 1 because it's doing a loop that finishes
only when
     the list fs_info->unused_bgs is empty and locks and unlocks the
spinlock
     fs_info->unused_bgs_lock on each iteration. So it deletes the block
group
     and its respective chunk is released. Even if it didn't do the
lock/unlock
     per iteration, it could still see block group X in the list, because
the
     cleaner kthread might call btrfs_delete_unused_bgs() multiple times
(for
     example if there are several snapshots to delete);

7) A new block group X' is created for data, and it's associated to the
same chunk
     that block group X was associated to;


Actually this can't happen, we search the commit root for a free dev extent,
so if block group X` get's mapped to a dev extent that was deleted in the
same transaction as it was free'd in then that is a different problem.

Hum, yep I missed the detail that when looking for free device extents
we use the commit root.
In that case I don't think anymore that there's a problem.

Thanks for looking at it.


I still think its a good idea just so we don't have a lot of churn, but change up the commit message to make it sound less like the original stuff would eat children. Thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to