On Wed, Dec 12, 2012 at 06:52:37PM -0700, Liu Bo wrote:
> An user reported that he has hit an annoying deadlock while playing with
> ceph based on btrfs.
> 
> Current updating device tree requires space from METADATA chunk,
> so we -may- need to do a recursive chunk allocation when adding/updating
> dev extent, that is where the deadlock comes from.
> 
> If we use SYSTEM metadata to update device tree, we can avoid the recursive
> stuff.
> 

This is going to cause us to allocate much more system chunks than we used to
which could land us in trouble.  Instead let's just keep us from re-entering if
we're already allocating a chunk.  We do the chunk allocation when we don't have
enough space for a cluster, but we'll likely have plenty of space to make an
allocation.  Can you give this patch a try Jim and see if it fixes your problem?
Thanks,

Josef


diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e152809..59df5e7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3564,6 +3564,10 @@ static int do_chunk_alloc(struct btrfs_trans_handle 
*trans,
        int wait_for_alloc = 0;
        int ret = 0;
 
+       /* Don't re-enter if we're already allocating a chunk */
+       if (trans->allocating_chunk)
+               return -ENOSPC;
+
        space_info = __find_space_info(extent_root->fs_info, flags);
        if (!space_info) {
                ret = update_space_info(extent_root->fs_info, flags,
@@ -3606,6 +3610,8 @@ again:
                goto again;
        }
 
+       trans->allocating_chunk = true;
+
        /*
         * If we have mixed data/metadata chunks we want to make sure we keep
         * allocating mixed chunks instead of individual chunks.
@@ -3632,6 +3638,7 @@ again:
        check_system_chunk(trans, extent_root, flags);
 
        ret = btrfs_alloc_chunk(trans, extent_root, flags);
+       trans->allocating_chunk = false;
        if (ret < 0 && ret != -ENOSPC)
                goto out;
 
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index e6509b9..47ad8be 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -388,6 +388,7 @@ again:
        h->qgroup_reserved = qgroup_reserved;
        h->delayed_ref_elem.seq = 0;
        h->type = type;
+       h->allocating_chunk = false;
        INIT_LIST_HEAD(&h->qgroup_ref_list);
        INIT_LIST_HEAD(&h->new_bgs);
 
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 0e8aa1e..69700f7 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -68,6 +68,7 @@ struct btrfs_trans_handle {
        struct btrfs_block_rsv *orig_rsv;
        short aborted;
        short adding_csums;
+       bool allocating_chunk;
        enum btrfs_trans_type type;
        /*
         * this root is only needed to validate that the root passed to
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to