On Wed, Dec 12, 2012 at 06:52:37PM -0700, Liu Bo wrote: > An user reported that he has hit an annoying deadlock while playing with > ceph based on btrfs. > > Current updating device tree requires space from METADATA chunk, > so we -may- need to do a recursive chunk allocation when adding/updating > dev extent, that is where the deadlock comes from. > > If we use SYSTEM metadata to update device tree, we can avoid the recursive > stuff. >
This is going to cause us to allocate much more system chunks than we used to which could land us in trouble. Instead let's just keep us from re-entering if we're already allocating a chunk. We do the chunk allocation when we don't have enough space for a cluster, but we'll likely have plenty of space to make an allocation. Can you give this patch a try Jim and see if it fixes your problem? Thanks, Josef diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index e152809..59df5e7 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3564,6 +3564,10 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, int wait_for_alloc = 0; int ret = 0; + /* Don't re-enter if we're already allocating a chunk */ + if (trans->allocating_chunk) + return -ENOSPC; + space_info = __find_space_info(extent_root->fs_info, flags); if (!space_info) { ret = update_space_info(extent_root->fs_info, flags, @@ -3606,6 +3610,8 @@ again: goto again; } + trans->allocating_chunk = true; + /* * If we have mixed data/metadata chunks we want to make sure we keep * allocating mixed chunks instead of individual chunks. @@ -3632,6 +3638,7 @@ again: check_system_chunk(trans, extent_root, flags); ret = btrfs_alloc_chunk(trans, extent_root, flags); + trans->allocating_chunk = false; if (ret < 0 && ret != -ENOSPC) goto out; diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index e6509b9..47ad8be 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -388,6 +388,7 @@ again: h->qgroup_reserved = qgroup_reserved; h->delayed_ref_elem.seq = 0; h->type = type; + h->allocating_chunk = false; INIT_LIST_HEAD(&h->qgroup_ref_list); INIT_LIST_HEAD(&h->new_bgs); diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h index 0e8aa1e..69700f7 100644 --- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -68,6 +68,7 @@ struct btrfs_trans_handle { struct btrfs_block_rsv *orig_rsv; short aborted; short adding_csums; + bool allocating_chunk; enum btrfs_trans_type type; /* * this root is only needed to validate that the root passed to -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html