On 2019/10/1 上午1:51, David Sterba wrote: > On Thu, Sep 26, 2019 at 02:35:45PM +0800, Qu Wenruo wrote: >> [BUG] >> There is one BUG_ON() report where a transaction is aborted during >> balance, then kernel BUG_ON() in merge_reloc_roots(): > > Do you have details from the report, eg. what's the error code?
Nope, what I get is only kernel message. Not enough to get the error code. > >> void merge_reloc_roots(struct reloc_control *rc) >> { >> ... >> BUG_ON(!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)); <<< >> } >> >> [CAUSE] >> It's still uncertain why we can get to such situation. >> As all __add_reloc_root() calls will also link that reloc root to >> rc->reloc_roots, and in merge_reloc_roots() we cleanup rc->reloc_roots. >> >> So the root cause is still uncertain. >> >> [FIX] >> But we can still hunt down all the BUG_ON() in merge_reloc_roots(). >> >> There are 3 BUG_ON()s in it: >> - BUG_ON() for read_fs_root() result >> - BUG_ON() for root->reloc_root != reloc_root case >> - BUG_ON() for the non-empty reloc_root_tree > > relocation.c is worst regarding number of BUG_ONs, some of them look > like runtime assertions of the invariants but some of them are > substituting for error handling. Yeah, if we inject kmalloc error into btrfs_relocate_block_group(), it's too easy to hit BUG_ON(). > > The first one BUG_ON(IS_ERR(root)); is clearly the latter, the other are > assertions > >> >> For the first two, just grab the return value and goto out tag for >> cleanup. >> >> For the last one, make it more graceful by: >> - grab the lock before doing read/write >> - warn instead of panic >> - cleanup the nodes in rc->reloc_root_tree >> >> Signed-off-by: Qu Wenruo <w...@suse.com> >> --- >> Reason for RFC: >> The root cause to leak nodes in reloc_root_tree is still unknown. >> --- >> fs/btrfs/relocation.c | 39 ++++++++++++++++++++++++++++++++++++--- >> 1 file changed, 36 insertions(+), 3 deletions(-) >> >> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c >> index 655f1d5a8c27..d562b5c52a40 100644 >> --- a/fs/btrfs/relocation.c >> +++ b/fs/btrfs/relocation.c >> @@ -2484,11 +2484,26 @@ void merge_reloc_roots(struct reloc_control *rc) >> if (btrfs_root_refs(&reloc_root->root_item) > 0) { >> root = read_fs_root(fs_info, >> reloc_root->root_key.offset); >> - BUG_ON(IS_ERR(root)); >> - BUG_ON(root->reloc_root != reloc_root); >> + if (IS_ERR(root)) { > > This is bug_on -> error handling, ok > >> + ret = PTR_ERR(root); >> + btrfs_err(fs_info, >> + "failed to read root %llu: %d", >> + reloc_root->root_key.offset, ret); >> + goto out; >> + } >> + if (root->reloc_root != reloc_root) { > > With this one I'm not sure it could happen but replacing the bug on is > still good. > >> + btrfs_err(fs_info, >> + "reloc root mismatch for root %llu", > > Would be good to print the number of the other root as well. > >> + root->root_key.objectid); >> + ret = -EINVAL; >> + goto out; >> + } >> >> ret = merge_reloc_root(rc, root); >> if (ret) { >> + btrfs_err(fs_info, >> + "failed to merge reloc tree for root %llu: %d", >> + root->root_key.objectid, ret); >> if (list_empty(&reloc_root->root_list)) >> list_add_tail(&reloc_root->root_list, >> &reloc_roots); >> @@ -2520,7 +2535,25 @@ void merge_reloc_roots(struct reloc_control *rc) >> free_reloc_roots(&reloc_roots); >> } >> >> - BUG_ON(!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)); > > This one looks more like the invariant, the tree should be really empty > here. While the cleanup is trying to make things work despite there's a > problem, I think the warning should not be debugging only. Indeed, we should output the warning no matter whatever. Thanks, Qu > >> + spin_lock(&rc->reloc_root_tree.lock); >> + /* Cleanup dirty reloc tree nodes */ >> + if (!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)) { >> + struct mapping_node *node; >> + struct mapping_node *next; >> + >> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); > > ... >
signature.asc
Description: OpenPGP digital signature