On 2018/9/26 下午10:29, David Sterba wrote: > On Tue, Sep 11, 2018 at 01:38:14PM +0800, Qu Wenruo wrote: >> Introduce new function, qgroup_trace_new_subtree_blocks(), to iterate >> all new tree blocks in a reloc tree. >> So that qgroup could skip unrelated tree blocks during balance, which >> should hugely speedup balance speed when quota is enabled. >> >> The function qgroup_trace_new_subtree_blocks() itself only cares about >> new tree blocks in reloc tree. >> >> All its main works are: >> >> 1) Read out tree blocks according to parent pointers >> >> 2) Do recursive depth-first search >> Will call the same function on all its children tree blocks, with >> search level set to current level -1. >> And will also skip all children whose generation is smaller than >> @last_snapshot. >> >> 3) Call qgroup_trace_extent_swap() to trace tree blocks >> >> So although we have parameter list related to source file tree, it's not >> used at all, but only passed to qgroup_trace_extent_swap(). >> Thus despite the tree read code, the core should be pretty short and all >> about recursive depth-first search. >> >> Signed-off-by: Qu Wenruo <w...@suse.com> >> --- >> fs/btrfs/qgroup.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 114 insertions(+) >> >> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c >> index 5155fb42ce79..0702953d70a7 100644 >> --- a/fs/btrfs/qgroup.c >> +++ b/fs/btrfs/qgroup.c >> @@ -1839,6 +1839,120 @@ static int qgroup_trace_extent_swap(struct >> btrfs_trans_handle* trans, >> return ret; >> } >> >> +/* >> + * Helper function to do recursive generation aware depth-first search, to >> + * locate all new tree blocks in a subtree of tree reloc tree. >> + * >> + * E.g. (OO = Old tree blocks, NN = New tree blocks, whose gen == >> last_snapshot) >> + * Tree reloc tree >> + * L2 NN (a) >> + * / \ >> + * L1 OO NN (b) >> + * / \ / \ >> + * L0 OO OO OO NN >> + * (c) (d) >> + * If we pass: >> + * @dst_path = [ nodes[1] = NN(b), nodes[0] = NULL ], >> + * @cur_level = 1 >> + * @root_level = 1 >> + * >> + * We will iterate through tree blocks NN(b), NN(d) and info qgroup to trace >> + * above tree blocks along with their counter parts in file tree. >> + * While during search, old tree blocsk OO(c) will be skiped as tree block >> swap >> + * won't affect OO(c). >> + */ >> +static int qgroup_trace_new_subtree_blocks(struct btrfs_trans_handle* trans, >> + struct extent_buffer *src_eb, >> + struct btrfs_path *dst_path, >> + int cur_level, int root_level, >> + u64 last_snapshot) >> +{ >> + struct btrfs_fs_info *fs_info = trans->fs_info; >> + struct extent_buffer *eb; >> + bool need_cleanup = false; >> + int ret = 0; >> + int i; > > This function could be called recursively based on the value of > cur_level so please add some sanity checks so this does not accidentally > escape. It needs to be a plain if/return, not just an assert or bug-on.
OK, I'll add a cur_level check here, and also enhance tree-checker to do a level check for nodes/leaves, so we could catch level problem even earlier, and keep all level passed into this function is valid. Thanks, Qu > >> + >> + /* Read the tree block if needed */ >> + if (dst_path->nodes[cur_level] == NULL) { >> + struct btrfs_key first_key; >> + int parent_slot; >> + u64 child_gen; >> + u64 child_bytenr; >> + >> + /* >> + * We need to get child blockptr/gen from parent before >> + * we can read it. >> + */ >> + eb = dst_path->nodes[cur_level + 1]; >> + parent_slot = dst_path->slots[cur_level + 1]; >> + child_bytenr = btrfs_node_blockptr(eb, parent_slot); >> + child_gen = btrfs_node_ptr_generation(eb, parent_slot); >> + btrfs_node_key_to_cpu(eb, &first_key, parent_slot); >> + >> + /* This node is old, no need to trace */ >> + if (child_gen < last_snapshot) >> + goto out; >> + >> + eb = read_tree_block(fs_info, child_bytenr, child_gen, >> + cur_level, &first_key); >> + if (IS_ERR(eb)) { >> + ret = PTR_ERR(eb); >> + goto out; >> + } else if (!extent_buffer_uptodate(eb)) { >> + free_extent_buffer(eb); >> + ret = -EIO; >> + goto out; >> + } >> + >> + dst_path->nodes[cur_level] = eb; >> + dst_path->slots[cur_level] = 0; >> + >> + btrfs_tree_read_lock(eb); >> + btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK); >> + dst_path->locks[cur_level] = BTRFS_READ_LOCK_BLOCKING; >> + need_cleanup = true; >> + } >> + >> + /* Now record this tree block and its counter part for qgroups */ >> + ret = qgroup_trace_extent_swap(trans, src_eb, dst_path, cur_level, >> + root_level); >> + if (ret < 0) >> + goto cleanup; >> + >> + eb = dst_path->nodes[cur_level]; >> + >> + if (cur_level > 0) { >> + /* Iterate all children tree blocks */ >> + for (i = 0; i < btrfs_header_nritems(eb); i++) { >> + /* Skip old tree blocks as they won't be swapped */ >> + if (btrfs_node_ptr_generation(eb, i) < last_snapshot) >> + continue; >> + dst_path->slots[cur_level] = i; >> + >> + /* Recursive call (at most 7 times) */ >> + ret = qgroup_trace_new_subtree_blocks(trans, src_eb, >> + dst_path, cur_level - 1, root_level, >> + last_snapshot); >> + if (ret < 0) >> + goto cleanup; >> + } >> + } >> + >> +cleanup: >> + if (need_cleanup) { >> + /* Clean up */ >> + btrfs_tree_unlock_rw(dst_path->nodes[cur_level], >> + dst_path->locks[cur_level]); >> + free_extent_buffer(dst_path->nodes[cur_level]); >> + dst_path->nodes[cur_level] = NULL; >> + dst_path->slots[cur_level] = 0; >> + dst_path->locks[cur_level] = 0; >> + } >> +out: >> + return ret; >> +} >> + >> int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans, >> struct extent_buffer *root_eb, >> u64 root_gen, int root_level) >> -- >> 2.18.0
signature.asc
Description: OpenPGP digital signature