On 2018/9/26 10:29 PM, David Sterba wrote:
> On Tue, Sep 11, 2018 at 01:38:14PM +0800, Qu Wenruo wrote:
>> Introduce new function, qgroup_trace_new_subtree_blocks(), to iterate
>> all new tree blocks in a reloc tree.
>> This allows qgroup to skip unrelated tree blocks during balance, which
>> should greatly speed up balance when quota is enabled.
>>
>> The function qgroup_trace_new_subtree_blocks() itself only cares about
>> new tree blocks in reloc tree.
>>
>> Its main tasks are:
>>
>> 1) Read out tree blocks according to parent pointers
>>
>> 2) Do recursive depth-first search
>>    Call the same function on each child tree block, with the search
>>    level set to current level - 1.
>>    Children whose generation is smaller than @last_snapshot are
>>    skipped.
>>
>> 3) Call qgroup_trace_extent_swap() to trace tree blocks
>>
>> So although the parameter list includes source file tree information, it is
>> not used here; it is only passed through to qgroup_trace_extent_swap().
>> Thus, apart from the tree block reading code, the core is pretty short and
>> consists mostly of the recursive depth-first search.
>>
>> Signed-off-by: Qu Wenruo <w...@suse.com>
>> ---
>>  fs/btrfs/qgroup.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 114 insertions(+)
>>
>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>> index 5155fb42ce79..0702953d70a7 100644
>> --- a/fs/btrfs/qgroup.c
>> +++ b/fs/btrfs/qgroup.c
>> @@ -1839,6 +1839,120 @@ static int qgroup_trace_extent_swap(struct btrfs_trans_handle *trans,
>>      return ret;
>>  }
>>  
>> +/*
>> + * Helper function to do recursive generation-aware depth-first search, to
>> + * locate all new tree blocks in a subtree of a tree reloc tree.
>> + *
>> + * E.g. (OO = Old tree blocks, NN = New tree blocks, whose gen == last_snapshot)
>> + *       Tree reloc tree
>> + * L2         NN (a)
>> + *          /    \
>> + * L1    OO        NN (b)
>> + *      /  \      /  \
>> + * L0  OO  OO    OO  NN
>> + *               (c) (d)
>> + * If we pass:
>> + * @dst_path = [ nodes[1] = NN(b), nodes[0] = NULL ],
>> + * @cur_level = 1
>> + * @root_level = 1
>> + *
>> + * We will iterate through tree blocks NN(b), NN(d) and inform qgroup to trace
>> + * the above tree blocks along with their counterparts in the file tree.
>> + * During the search, the old tree block OO(c) will be skipped, as tree block
>> + * swap won't affect OO(c).
>> + */
>> +static int qgroup_trace_new_subtree_blocks(struct btrfs_trans_handle *trans,
>> +                                       struct extent_buffer *src_eb,
>> +                                       struct btrfs_path *dst_path,
>> +                                       int cur_level, int root_level,
>> +                                       u64 last_snapshot)
>> +{
>> +    struct btrfs_fs_info *fs_info = trans->fs_info;
>> +    struct extent_buffer *eb;
>> +    bool need_cleanup = false;
>> +    int ret = 0;
>> +    int i;
> 
> This function could be called recursively based on the value of
> cur_level so please add some sanity checks so this does not accidentally
> escape. It needs to be a plain if/return, not just an assert or bug-on.

OK, I'll add a cur_level check here, and also enhance the tree-checker to do
a level check for nodes/leaves, so we can catch level problems even earlier
and ensure every level passed into this function is valid.
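
Something like this minimal sketch at the top of
qgroup_trace_new_subtree_blocks() (only an illustration of the plain
if/return check you asked for; the error code and message wording are my
placeholders, not the final patch):

	/*
	 * Plain if/return sanity check, not ASSERT()/BUG_ON(): reject
	 * out-of-range levels before any recursion or array access, so a
	 * corrupted on-disk level can't run off dst_path->nodes[]/slots[].
	 */
	if (unlikely(root_level < 0 || root_level >= BTRFS_MAX_LEVEL ||
		     cur_level < 0 || cur_level > root_level)) {
		btrfs_err_rl(fs_info,
			"%s: bad levels, cur_level=%d root_level=%d",
			__func__, cur_level, root_level);
		return -EUCLEAN;
	}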

Thanks,
Qu

> 
>> +
>> +    /* Read the tree block if needed */
>> +    if (dst_path->nodes[cur_level] == NULL) {
>> +            struct btrfs_key first_key;
>> +            int parent_slot;
>> +            u64 child_gen;
>> +            u64 child_bytenr;
>> +
>> +            /*
>> +             * We need to get child blockptr/gen from parent before
>> +             * we can read it.
>> +             */
>> +            eb = dst_path->nodes[cur_level + 1];
>> +            parent_slot = dst_path->slots[cur_level + 1];
>> +            child_bytenr = btrfs_node_blockptr(eb, parent_slot);
>> +            child_gen = btrfs_node_ptr_generation(eb, parent_slot);
>> +            btrfs_node_key_to_cpu(eb, &first_key, parent_slot);
>> +
>> +            /* This node is old, no need to trace */
>> +            if (child_gen < last_snapshot)
>> +                    goto out;
>> +
>> +            eb = read_tree_block(fs_info, child_bytenr, child_gen,
>> +                                 cur_level, &first_key);
>> +            if (IS_ERR(eb)) {
>> +                    ret = PTR_ERR(eb);
>> +                    goto out;
>> +            } else if (!extent_buffer_uptodate(eb)) {
>> +                    free_extent_buffer(eb);
>> +                    ret = -EIO;
>> +                    goto out;
>> +            }
>> +
>> +            dst_path->nodes[cur_level] = eb;
>> +            dst_path->slots[cur_level] = 0;
>> +
>> +            btrfs_tree_read_lock(eb);
>> +            btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
>> +            dst_path->locks[cur_level] = BTRFS_READ_LOCK_BLOCKING;
>> +            need_cleanup = true;
>> +    }
>> +
>> +    /* Now record this tree block and its counter part for qgroups */
>> +    ret = qgroup_trace_extent_swap(trans, src_eb, dst_path, cur_level,
>> +                                   root_level);
>> +    if (ret < 0)
>> +            goto cleanup;
>> +
>> +    eb = dst_path->nodes[cur_level];
>> +
>> +    if (cur_level > 0) {
>> +            /* Iterate all children tree blocks */
>> +            for (i = 0; i < btrfs_header_nritems(eb); i++) {
>> +                    /* Skip old tree blocks as they won't be swapped */
>> +                    if (btrfs_node_ptr_generation(eb, i) < last_snapshot)
>> +                            continue;
>> +                    dst_path->slots[cur_level] = i;
>> +
>> +                    /* Recursive call (at most 7 times) */
>> +                    ret = qgroup_trace_new_subtree_blocks(trans, src_eb,
>> +                                    dst_path, cur_level - 1, root_level,
>> +                                    last_snapshot);
>> +                    if (ret < 0)
>> +                            goto cleanup;
>> +            }
>> +    }
>> +
>> +cleanup:
>> +    if (need_cleanup) {
>> +            /* Clean up */
>> +            btrfs_tree_unlock_rw(dst_path->nodes[cur_level],
>> +                                 dst_path->locks[cur_level]);
>> +            free_extent_buffer(dst_path->nodes[cur_level]);
>> +            dst_path->nodes[cur_level] = NULL;
>> +            dst_path->slots[cur_level] = 0;
>> +            dst_path->locks[cur_level] = 0;
>> +    }
>> +out:
>> +    return ret;
>> +}
>> +
>>  int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans,
>>                             struct extent_buffer *root_eb,
>>                             u64 root_gen, int root_level)
>> -- 
>> 2.18.0
