On 5/4/18 1:56 AM, Qu Wenruo wrote:
> Under the following case, qgroup rescan can double account cowed tree
> blocks:
> 
> In this case, extent tree only has one tree block.
> 
> -
> | transid=5 last committed=4
> | btrfs_qgroup_rescan_worker()
> | |- btrfs_start_transaction()
> | |  transid = 5
> | |- qgroup_rescan_leaf()
> |    |- btrfs_search_slot_for_read() on extent tree
> |       Get the only extent tree block from commit root (transid = 4).
> |       Scan it, set qgroup_rescan_progress to the last
> |       EXTENT/META_ITEM + 1
> |       now qgroup_rescan_progress = A + 1.
> |
> | fs tree get CoWed, new tree block is at A + 16K
> | transid 5 get committed
> -
> | transid=6 last committed=5
> | btrfs_qgroup_rescan_worker()
> | btrfs_qgroup_rescan_worker()
> | |- btrfs_start_transaction()
> | |  transid = 5
> | |- qgroup_rescan_leaf()
> |    |- btrfs_search_slot_for_read() on extent tree
> |       Get the only extent tree block from commit root (transid = 5).
> |       scan it using qgroup_rescan_progress (A + 1).
> |       found new tree block beyong A, and it's fs tree block,
> |       account it to increase qgroup numbers.
> -
> 
> In above case, tree block A, and tree block A + 16K get accounted twice,
> while qgroup rescan should stop when it already reach the last leaf,
> other than continue using its qgroup_rescan_progress.
> 
> Such case could happen by just looping btrfs/017 and with some
> possibility it can hit such double qgroup accounting problem.
> 
> Fix it by checking the path to determine if we should finish qgroup
> rescan, other than relying on next loop to exit.
> 
> Reported-by: Nikolay Borisov <nbori...@suse.com>
> Signed-off-by: Qu Wenruo <w...@suse.com>
> ---
>  fs/btrfs/qgroup.c | 48 +++++++++++++++++++++++++++++++++--------------
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 829e8fe5c97e..2ee2d21d43ab 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -2579,6 +2579,21 @@ void btrfs_qgroup_free_refroot(struct btrfs_fs_info 
> *fs_info,
>       spin_unlock(&fs_info->qgroup_lock);
>  }
>  
> +/*
> + * Check if the leaf is the last leaf. Which means all node pointers
> + * are at their last position.
> + */
> +static bool is_last_leaf(struct btrfs_path *path)
> +{
> +     int i;
> +
> +     for (i = 1; i < BTRFS_MAX_LEVEL && path->nodes[i]; i++) {
> +             if (path->slots[i] != btrfs_header_nritems(path->nodes[i]) - 1)
> +                     return false;
> +     }
> +     return true;
> +}
> +
>  /*
>   * returns < 0 on error, 0 when more leafs are to be scanned.
>   * returns 1 when done.
> @@ -2592,6 +2607,7 @@ qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, 
> struct btrfs_path *path,
>       struct ulist *roots = NULL;
>       struct seq_list tree_mod_seq_elem = SEQ_LIST_INIT(tree_mod_seq_elem);
>       u64 num_bytes;
> +     bool done;
>       int slot;
>       int ret;
>  
> @@ -2606,20 +2622,9 @@ qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, 
> struct btrfs_path *path,
>               fs_info->qgroup_rescan_progress.type,
>               fs_info->qgroup_rescan_progress.offset, ret);
>  
> -     if (ret) {
> -             /*
> -              * The rescan is about to end, we will not be scanning any
> -              * further blocks. We cannot unset the RESCAN flag here, because
> -              * we want to commit the transaction if everything went well.
> -              * To make the live accounting work in this phase, we set our
> -              * scan progress pointer such that every real extent objectid
> -              * will be smaller.
> -              */
> -             fs_info->qgroup_rescan_progress.objectid = (u64)-1;
> -             btrfs_release_path(path);
> -             mutex_unlock(&fs_info->qgroup_rescan_lock);
> -             return ret;
> -     }
> +     done = is_last_leaf(path);
> +     if (ret)
> +             goto finish;
>  
>       btrfs_item_key_to_cpu(path->nodes[0], &found,
>                             btrfs_header_nritems(path->nodes[0]) - 1);
> @@ -2665,8 +2670,23 @@ qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, 
> struct btrfs_path *path,
>               free_extent_buffer(scratch_leaf);
>       }
>       btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem);
> +     if (done && !ret)
> +             goto finish;

This causes a double unlock.  The lock was released prior to iterating
the leaf.  Otherwise, looks good.

-Jeff

>  
>       return ret;
> +finish:
> +     /*
> +      * The rescan is about to end, we will not be scanning any
> +      * further blocks. We cannot unset the RESCAN flag here, because
> +      * we want to commit the transaction if everything went well.
> +      * To make the live accounting work in this phase, we set our
> +      * scan progress pointer such that every real extent objectid
> +      * will be smaller.
> +      */
> +     fs_info->qgroup_rescan_progress.objectid = (u64)-1;
> +     btrfs_release_path(path);
> +     mutex_unlock(&fs_info->qgroup_rescan_lock);
> +     return 1;
>  }
>  
>  static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> 


-- 
Jeff Mahoney
SUSE Labs

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to