On Thu, Oct 22, 2015 at 3:58 PM, Stéphane Lesimple <[email protected]> wrote: > Le 2015-10-22 11:47, Filipe Manana a écrit : >> >> On Thu, Oct 22, 2015 at 10:43 AM, Filipe Manana <[email protected]> >> wrote: >>> >>> On Thu, Oct 22, 2015 at 10:32 AM, Qu Wenruo <[email protected]> >>> wrote: >>>> >>>> >>>> >>>> wrote on 2015/10/22 09:47 +0100: >>>>> >>>>> >>>>> From: Filipe Manana <[email protected]> >>>>> >>>>> In the kernel 4.2 merge window we had a refactoring/rework of the >>>>> delayed >>>>> references implementation in order to fix certain problems with >>>>> qgroups. >>>>> However that rework introduced one more regression that leads to the >>>>> following trace when running delayed references for metadata: >>>>> >>>>> [35908.064664] kernel BUG at fs/btrfs/extent-tree.c:1832! >>>>> [35908.065201] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC >>>>> [35908.065201] Modules linked in: dm_flakey dm_mod btrfs crc32c_generic >>>>> xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace >>>>> fscache >>>>> sunrpc loop fuse parport_pc psmouse i2 >>>>> [35908.065201] CPU: 14 PID: 15014 Comm: kworker/u32:9 Tainted: G >>>>> W >>>>> 4.3.0-rc5-btrfs-next-17+ #1 >>>>> [35908.065201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >>>>> BIOS >>>>> rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 >>>>> [35908.065201] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper >>>>> [btrfs] >>>>> [35908.065201] task: ffff880114b7d780 ti: ffff88010c4c8000 task.ti: >>>>> ffff88010c4c8000 >>>>> [35908.065201] RIP: 0010:[<ffffffffa04928b5>] [<ffffffffa04928b5>] >>>>> insert_inline_extent_backref+0x52/0xb1 [btrfs] >>>>> [35908.065201] RSP: 0018:ffff88010c4cbb08 EFLAGS: 00010293 >>>>> [35908.065201] RAX: 0000000000000000 RBX: ffff88008a661000 RCX: >>>>> 0000000000000000 >>>>> [35908.065201] RDX: ffffffffa04dd58f RSI: 0000000000000001 RDI: >>>>> 0000000000000000 >>>>> [35908.065201] RBP: ffff88010c4cbb40 R08: 0000000000001000 R09: >>>>> 
ffff88010c4cb9f8 >>>>> [35908.065201] R10: 0000000000000000 R11: 000000000000002c R12: >>>>> 0000000000000000 >>>>> [35908.065201] R13: ffff88020a74c578 R14: 0000000000000000 R15: >>>>> 0000000000000000 >>>>> [35908.065201] FS: 0000000000000000(0000) GS:ffff88023edc0000(0000) >>>>> knlGS:0000000000000000 >>>>> [35908.065201] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>>>> [35908.065201] CR2: 00000000015e8708 CR3: 0000000102185000 CR4: >>>>> 00000000000006e0 >>>>> [35908.065201] Stack: >>>>> [35908.065201] ffff88010c4cbb18 0000000000000f37 ffff88020a74c578 >>>>> ffff88015a408000 >>>>> [35908.065201] ffff880154a44000 0000000000000000 0000000000000005 >>>>> ffff88010c4cbbd8 >>>>> [35908.065201] ffffffffa0492b9a 0000000000000005 0000000000000000 >>>>> 0000000000000000 >>>>> [35908.065201] Call Trace: >>>>> [35908.065201] [<ffffffffa0492b9a>] __btrfs_inc_extent_ref+0x8b/0x208 >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa0497117>] ? >>>>> __btrfs_run_delayed_refs+0x4d4/0xd33 [btrfs] >>>>> [35908.065201] [<ffffffffa049773d>] >>>>> __btrfs_run_delayed_refs+0xafa/0xd33 >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa04a976a>] ? >>>>> join_transaction.isra.10+0x25/0x41f >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa04a97ed>] ? >>>>> join_transaction.isra.10+0xa8/0x41f >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa049914d>] btrfs_run_delayed_refs+0x75/0x1dd >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa04992f1>] delayed_ref_async_start+0x3c/0x7b >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa04d4b4f>] normal_work_helper+0x14c/0x32a >>>>> [btrfs] >>>>> [35908.065201] [<ffffffffa04d4e93>] btrfs_extent_refs_helper+0x12/0x14 >>>>> [btrfs] >>>>> [35908.065201] [<ffffffff81063b23>] process_one_work+0x24a/0x4ac >>>>> [35908.065201] [<ffffffff81064285>] worker_thread+0x206/0x2c2 >>>>> [35908.065201] [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb >>>>> [35908.065201] [<ffffffff8106407f>] ? 
rescuer_thread+0x2cb/0x2cb >>>>> [35908.065201] [<ffffffff8106904d>] kthread+0xef/0xf7 >>>>> [35908.065201] [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24 >>>>> [35908.065201] [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70 >>>>> [35908.065201] [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24 >>>>> [35908.065201] Code: 6a 01 41 56 41 54 ff 75 10 41 51 4d 89 c1 49 89 c8 >>>>> 48 >>>>> 8d 4d d0 e8 f6 f1 ff ff 48 83 c4 28 85 c0 75 2c 49 81 fc ff 00 00 00 77 >>>>> 02 >>>>> <0f> 0b 4c 8b 45 30 8b 4d 28 45 31 >>>>> [35908.065201] RIP [<ffffffffa04928b5>] >>>>> insert_inline_extent_backref+0x52/0xb1 [btrfs] >>>>> [35908.065201] RSP <ffff88010c4cbb08> >>>>> [35908.310885] ---[ end trace fe4299baf0666457 ]--- >>>>> >>>>> This happens because the new delayed references code no longer merges >>>>> delayed references that have different sequence values. The following >>>>> steps are an example sequence leading to this issue: >>>>> >>>>> 1) Transaction N starts, fs_info->tree_mod_seq has value 0; >>>>> >>>>> 2) Extent buffer (btree node) A is allocated, delayed reference Ref1 >>>>> for >>>>> bytenr A is created, with a value of 1 and a seq value of 0; >>>>> >>>>> 3) fs_info->tree_mod_seq is incremented to 1; >>>>> >>>>> 4) Extent buffer A is deleted through btrfs_del_items(), which calls >>>>> btrfs_del_leaf(), which in turn calls btrfs_free_tree_block(). The >>>>> later returns the metadata extent associated to extent buffer A to >>>>> the free space cache (the range is not pinned), because the extent >>>>> buffer was created in the current transaction (N) and writeback >>>>> never >>>>> happened for the extent buffer (flag BTRFS_HEADER_FLAG_WRITTEN not >>>>> set >>>>> in the extent buffer). 
>>>>>    This creates the delayed reference Ref2 for bytenr A, with a value
>>>>>    of -1 and a seq value of 1;
>>>>>
>>>>> 5) Delayed reference Ref2 is not merged with Ref1 when we create it,
>>>>>    because they have different sequence numbers (decided at
>>>>>    add_delayed_ref_tail_merge());
>>>>>
>>>>> 6) fs_info->tree_mod_seq is incremented to 2;
>>>>>
>>>>> 7) Some task attempts to allocate a new extent buffer (done at
>>>>>    extent-tree.c:find_free_extent()), but due to heavy fragmentation
>>>>>    and running low on metadata space the clustered allocation fails
>>>>>    and we fall back to unclustered allocation, which finds the
>>>>>    extent at offset A, so a new extent buffer at offset A is
>>>>>    allocated.
>>>>>    This creates delayed reference Ref3 for bytenr A, with a value of
>>>>>    1 and a seq value of 2;
>>>>>
>>>>> 8) Ref3 is merged with neither Ref2 nor Ref1, again because they
>>>>>    all have different seq values;
>>>>>
>>>>> 9) We start running the delayed references
>>>>>    (__btrfs_run_delayed_refs());
>>>>>
>>>>> 10) The delayed Ref1 is the first one being applied, which ends up
>>>>>     creating an inline extent backref in the extent tree;
>>>>>
>>>>> 11) Next the delayed reference Ref3 is selected for execution, and not
>>>>>     Ref2, because select_delayed_ref() always gives a preference for
>>>>>     positive references (that have an action of
>>>>>     BTRFS_ADD_DELAYED_REF);
>>>>>
>>>>> 12) When running Ref3 we find the inline extent backref already
>>>>>     present in the extent tree at insert_inline_extent_backref(),
>>>>>     which makes us hit the following BUG_ON:
>>>>>
>>>>>         BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
>>>>>
>>>>>     This is always true because owner corresponds to the level of the
>>>>>     extent buffer/btree node in the btree.
>>>>>
>>>>> For the scenario described above we hit the BUG_ON because we never
>>>>> merge references that have different seq values.
>>>>>
>>>>> We used to do the merging before the 4.2 kernel, more specifically,
>>>>> before the commits:
>>>>>
>>>>>   c6fc24549960 ("btrfs: delayed-ref: Use list to replace the ref_root
>>>>>   in ref_head.")
>>>>>   c43d160fcd5e ("btrfs: delayed-ref: Cleanup the unneeded functions.")
>>>>>
>>>>> This issue became more exposed after the following change that was
>>>>> added to 4.2 as well:
>>>>>
>>>>>   cffc3374e567 ("Btrfs: fix order by which delayed references are
>>>>>   run")
>>>>>
>>>>> Which in turn fixed another regression introduced by the two commits
>>>>> previously mentioned.
>>>>>
>>>>> So fix this by bringing back the delayed reference merge code, with the
>>>>> proper adaptations so that it operates against the new data structure
>>>>> (linked list vs old red black tree implementation).
>>>>>
>>>>> This issue was hit running fstest btrfs/063 in a loop. Several people
>>>>> have reported this issue in the mailing list when running on kernels
>>>>> 4.2+.
>>>>
>>>>
>>>>
>>>> Thanks Filipe,
>>>>
>>>> My fault again. :(
>>>> But I'm not completely sure whether tree_mod_seq is still needed now.
>>>>
>>>> IIRC, with the new qgroup accounting happening at commit_transaction
>>>> time, btrfs_find_all_roots() should either search the commit tree for
>>>> old roots, or search the current tree for new roots.
>>>> No need to search using tree_mod_seq.
>>>>
>>>> If so, I'd like to just allow merging refs without checking tree_mod_seq.
>>
>>
>> And to make it clear, that wouldn't work. If a backref walker starts
>> iterating the btrees and then new delayed refs get merged
>> independently of the current tree mod seq, the walker will see an
>> inconsistent state in the extent tree if the delayed references are
>> run (which can happen often before a transaction commit).
>> >> So either make delayed references continue using tree mod seq as >> before 4.2, or come with a whole new mechanism that replaces the tree >> mod seq while still giving the same consistency guarantees. > > > I used your patch at https://patchwork.kernel.org/patch/7463161/ to build a > 4.3.0-rc6 kernel. > I mounted my FS with skip_balance, cancelled the paused balance (just to be > sure), then started a new one with : > > # btrfs balance start -dconvert=raid5,soft /tank > > (half of my data blocks are RAID1, the other half is RAID5, the goal is to > be fully RAID5). > > I got a different stacktrace than the usual one, via netconsole, after a few > minutes. It's still referencing btrfs_run_delayed_refs though: > > [ 822.461809] ------------[ cut here ]------------ > [ 822.461833] kernel BUG at fs/btrfs/extent-tree.c:2287! > [ 822.461849] invalid opcode: 0000 [#1] SMP > [ 822.461866] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink > xt_multiport xt_comment xt_conntrack xt_nat xt_tcpudp xts gf128mul drbg > ansi_cprng btrfs nf_conntrack_ftp nf_conntrack_sane iptable_security > iptable_filter iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 > [ 822.462365] CPU: 1 PID: 3810 Comm: btrfs-transacti Tainted: G W > 4.3.0-rc6p7463161+ #3 > [ 822.462391] Hardware name: ASUS All Series/H87I-PLUS, BIOS 1005 > 01/06/2014 > [ 822.462412] task: ffff88011a4a1a00 ti: ffff8800b6638000 task.ti: > ffff8800b6638000 > [ 822.462434] RIP: 0010:[<ffffffffc032310b>] [<ffffffffc032310b>] > __btrfs_run_delayed_refs.constprop.73+0x108b/0x10d0 [btrfs] > [ 822.462476] RSP: 0018:ffff8800b663bcb0 EFLAGS: 00010202 > [ 822.462495] RAX: 0000000000000001 RBX: ffff8800b3821888 RCX: > ffff8800b3a46cb8 > [ 822.462517] RDX: 0000000000000001 RSI: 00000000000001e1 RDI: > ffff8800b3a46cb0 > [ 822.462543] RBP: ffff8800b663bdb8 R08: 0000000000000000 R09: > ffff8800b3a46cb8 > [ 822.462565] R10: ffff8800b3a46cb8 R11: ffff8800b3a46cb8 R12: > 0000000000000000 > [ 822.462587] R13: 
0000000000000000 R14: 00000afefd330000 R15: > ffff8800b3a46c38 > [ 822.462687] FS: 0000000000000000(0000) GS:ffff88011fb00000(0000) > knlGS:0000000000000000 > [ 822.462716] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 822.462736] CR2: 00007ff7a21ca000 CR3: 0000000001c10000 CR4: > 00000000000406e0 > [ 822.462762] Stack: > [ 822.462770] ffffffffc037c8be ffff88011fb169f0 0000000000000001 > 0000000000000001 > [ 822.462802] 0000000000000001 ffff88011a4a1a60 ffff8800b663bd38 > ffff8800b663bd40 > [ 822.462831] ffffffff810aded5 ffffffff8101fcc9 ffff8800b663bd20 > 00000000000014d9 > [ 822.462859] Call Trace: > [ 822.462877] [<ffffffffc037c8be>] ? > try_merge_free_space.isra.26+0x12e/0x180 [btrfs] > [ 822.462902] [<ffffffff810aded5>] ? put_prev_entity+0x35/0x660 > [ 822.462924] [<ffffffff8101fcc9>] ? sched_clock+0x9/0x10 > [ 822.462949] [<ffffffffc0325dcd>] btrfs_run_delayed_refs+0x7d/0x2b0 > [btrfs] > [ 822.462972] [<ffffffff817aa7ab>] ? schedule_timeout+0x16b/0x2a0 > [ 822.462999] [<ffffffffc033a6b3>] btrfs_commit_transaction+0x43/0xb10 > [btrfs] > [ 822.463028] [<ffffffffc0335c19>] transaction_kthread+0x1a9/0x230 [btrfs] > [ 822.463056] [<ffffffffc0335a70>] ? btrfs_cleanup_transaction+0x550/0x550 > [btrfs] > [ 822.463080] [<ffffffff81097099>] kthread+0xc9/0xe0 > [ 822.463096] [<ffffffff81096fd0>] ? kthread_park+0x60/0x60 > [ 822.463116] [<ffffffff817aba8f>] ret_from_fork+0x3f/0x70 > [ 822.463134] [<ffffffff81096fd0>] ? kthread_park+0x60/0x60 > [ 822.463153] Code: c0 48 8b bd 40 ff ff ff 31 c0 e8 31 94 fe ff 0f 0b be > d3 00 00 00 48 c7 c7 3b c6 3b c0 e8 6e 63 d5 c0 e9 64 f9 ff ff 0f 0b 0f 0b > <0f> 0b be d3 00 00 00 48 c7 c7 3b c6 3b c0 e8 52 63 d5 c0 e9 60 > [ 822.463300] RIP [<ffffffffc032310b>] > __btrfs_run_delayed_refs.constprop.73+0x108b/0x10d0 [btrfs] > [ 822.463335] RSP <ffff8800b663bcb0> > [ 822.472131] ---[ end trace f1e21f38cb0ea144 ]--- > > A couple other stacktraces follow after some seconds, then the system dies > completely. 
> sysrqd doesn't even work to reboot it remotely using sysrq logic.
>
> fs/btrfs/extent-tree.c:2287 is the line you get from a vanilla 4.3-rc6 +
> your patch. I'll post it as soon as I can get somebody to manually reboot
> this remote machine (my kernel build machine is the same as the one
> hosting the btrfs FS).
>
> Don't hesitate to ask if you need me to debug or even ftrace something.
Thanks Stéphane. I haven't seen that crash yet (still running tests for
2 consecutive days now).

Can you please try the following patch, which applies on top of mine,
and enable ftrace before running balance:

Debug patch:

https://friendpaste.com/5s3dItRpcpq3dH1E4KUJor

Enable ftrace:

  $ echo > /sys/kernel/debug/tracing/trace
  $ echo "nop" > /sys/kernel/debug/tracing/current_tracer
  $ echo 100000 > /sys/kernel/debug/tracing/buffer_size_kb   # if you can use larger buffer size, even better
  $ echo > /sys/kernel/debug/tracing/set_ftrace_filter
  $ echo 1 > /sys/kernel/debug/tracing/tracing_on

  $ run balance... wait until it finishes with IO error or the patch's
    printk message shows up in dmesg/syslog

  $ echo 0 > /sys/kernel/debug/tracing/tracing_on

  $ cat /sys/kernel/debug/tracing/trace > some_file.txt

Then send us some_file.txt for debugging; hopefully it will give some
useful information. Note that it might produce tons of messages,
depending on how long it takes for you to hit the BUG_ON.

Thanks a lot for this.

>
> Thanks,
>
> --
> Stéphane.
>
>>>> I was going to do it but I'm not completely sure whether there is any
>>>> other user of tree_mod_seq.
>>>> And if it's possible to get rid of tree_mod_seq and merge with the last
>>>> delayed_ref, things should get cleaner without new code.
>>>
>>>
>>> Well, the tree mod seq is what allows backref walkers (and possibly
>>> other paths) to get a consistent view of all btrees and delayed refs
>>> state while doing some processing - that's why we have calls to
>>> btrfs_check_delayed_seq() when running delayed references - so that
>>> any backref walker will not see changes that happen after it started,
>>> i.e. it will see a consistent view of all the btrees (like an
>>> in-memory snapshot of all btrees while the transaction is running).
>>>
>>> I don't think you can get this level of consistency through any other
>>> existing means.
>>> So just adding back yet more code that was removed despite still being
>>> needed.
>>> >>> This is affecting way too many people now, I would like to get this >>> fixed and later, if there's a better (new) solution for this, we can >>> get it in. >>> >>> thanks >>> >>>> >>>> Thanks, >>>> Qu >>>> >>>> >>>> >>>>> >>>>> Fixes: c6fc24549960 ("btrfs: delayed-ref: Use list to replace the >>>>> ref_root >>>>> in ref_head.") >>>>> Reported-by: Peter Becker <[email protected]> >>>>> Reported-by: Stéphane Lesimple <[email protected]> >>>>> Reported-by: Malte Schröder <[email protected]> >>>>> Reported-by: Derek Dongray <[email protected]> >>>>> Reported-by: Erkki Seppala <[email protected]> >>>>> Cc: [email protected] # 4.2+ >>>>> Signed-off-by: Filipe Manana <[email protected]> >>>>> --- >>>>> fs/btrfs/delayed-ref.c | 113 >>>>> +++++++++++++++++++++++++++++++++++++++++++++++++ >>>>> fs/btrfs/extent-tree.c | 14 ++++++ >>>>> 2 files changed, 127 insertions(+) >>>>> >>>>> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c >>>>> index ac3e81d..4832943 100644 >>>>> --- a/fs/btrfs/delayed-ref.c >>>>> +++ b/fs/btrfs/delayed-ref.c >>>>> @@ -197,6 +197,119 @@ static inline void drop_delayed_ref(struct >>>>> btrfs_trans_handle *trans, >>>>> trans->delayed_ref_updates--; >>>>> } >>>>> >>>>> +static bool merge_ref(struct btrfs_trans_handle *trans, >>>>> + struct btrfs_delayed_ref_root *delayed_refs, >>>>> + struct btrfs_delayed_ref_head *head, >>>>> + struct btrfs_delayed_ref_node *ref, >>>>> + u64 seq) >>>>> +{ >>>>> + struct btrfs_delayed_ref_node *next; >>>>> + bool done = false; >>>>> + >>>>> + next = list_first_entry(&head->ref_list, struct >>>>> btrfs_delayed_ref_node, >>>>> + list); >>>>> + while (!done && &next->list != &head->ref_list) { >>>>> + int mod; >>>>> + struct btrfs_delayed_ref_node *next2; >>>>> + >>>>> + next2 = list_next_entry(next, list); >>>>> + >>>>> + if (next == ref) >>>>> + goto next; >>>>> + >>>>> + if (seq && next->seq >= seq) >>>>> + goto next; >>>>> + >>>>> + if (next->type != ref->type || next->no_quota != >>>>> 
ref->no_quota) >>>>> + goto next; >>>>> + >>>>> + if ((ref->type == BTRFS_TREE_BLOCK_REF_KEY || >>>>> + ref->type == BTRFS_SHARED_BLOCK_REF_KEY) && >>>>> + comp_tree_refs(btrfs_delayed_node_to_tree_ref(ref), >>>>> + >>>>> btrfs_delayed_node_to_tree_ref(next), >>>>> + ref->type)) >>>>> + goto next; >>>>> + if ((ref->type == BTRFS_EXTENT_DATA_REF_KEY || >>>>> + ref->type == BTRFS_SHARED_DATA_REF_KEY) && >>>>> + comp_data_refs(btrfs_delayed_node_to_data_ref(ref), >>>>> + >>>>> btrfs_delayed_node_to_data_ref(next))) >>>>> + goto next; >>>>> + >>>>> + if (ref->action == next->action) { >>>>> + mod = next->ref_mod; >>>>> + } else { >>>>> + if (ref->ref_mod < next->ref_mod) { >>>>> + swap(ref, next); >>>>> + done = true; >>>>> + } >>>>> + mod = -next->ref_mod; >>>>> + } >>>>> + >>>>> + drop_delayed_ref(trans, delayed_refs, head, next); >>>>> + ref->ref_mod += mod; >>>>> + if (ref->ref_mod == 0) { >>>>> + drop_delayed_ref(trans, delayed_refs, head, >>>>> ref); >>>>> + done = true; >>>>> + } else { >>>>> + /* >>>>> + * Can't have multiples of the same ref on a >>>>> tree >>>>> block. >>>>> + */ >>>>> + WARN_ON(ref->type == BTRFS_TREE_BLOCK_REF_KEY >>>>> || >>>>> + ref->type == >>>>> BTRFS_SHARED_BLOCK_REF_KEY); >>>>> + } >>>>> +next: >>>>> + next = next2; >>>>> + } >>>>> + >>>>> + return done; >>>>> +} >>>>> + >>>>> +void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, >>>>> + struct btrfs_fs_info *fs_info, >>>>> + struct btrfs_delayed_ref_root >>>>> *delayed_refs, >>>>> + struct btrfs_delayed_ref_head *head) >>>>> +{ >>>>> + struct btrfs_delayed_ref_node *ref; >>>>> + u64 seq = 0; >>>>> + >>>>> + assert_spin_locked(&head->lock); >>>>> + >>>>> + if (list_empty(&head->ref_list)) >>>>> + return; >>>>> + >>>>> + /* We don't have too many refs to merge for data. 
*/ >>>>> + if (head->is_data) >>>>> + return; >>>>> + >>>>> + spin_lock(&fs_info->tree_mod_seq_lock); >>>>> + if (!list_empty(&fs_info->tree_mod_seq_list)) { >>>>> + struct seq_list *elem; >>>>> + >>>>> + elem = list_first_entry(&fs_info->tree_mod_seq_list, >>>>> + struct seq_list, list); >>>>> + seq = elem->seq; >>>>> + } >>>>> + spin_unlock(&fs_info->tree_mod_seq_lock); >>>>> + >>>>> + ref = list_first_entry(&head->ref_list, struct >>>>> btrfs_delayed_ref_node, >>>>> + list); >>>>> + while (&ref->list != &head->ref_list) { >>>>> + if (seq && ref->seq >= seq) >>>>> + goto next; >>>>> + >>>>> + if (merge_ref(trans, delayed_refs, head, ref, seq)) { >>>>> + if (list_empty(&head->ref_list)) >>>>> + break; >>>>> + ref = list_first_entry(&head->ref_list, >>>>> + struct >>>>> btrfs_delayed_ref_node, >>>>> + list); >>>>> + continue; >>>>> + } >>>>> +next: >>>>> + ref = list_next_entry(ref, list); >>>>> + } >>>>> +} >>>>> + >>>>> int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info, >>>>> struct btrfs_delayed_ref_root >>>>> *delayed_refs, >>>>> u64 seq) >>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >>>>> index 522fb45..42d9310 100644 >>>>> --- a/fs/btrfs/extent-tree.c >>>>> +++ b/fs/btrfs/extent-tree.c >>>>> @@ -2433,7 +2433,21 @@ static noinline int >>>>> __btrfs_run_delayed_refs(struct >>>>> btrfs_trans_handle *trans, >>>>> } >>>>> } >>>>> >>>>> + /* >>>>> + * We need to try and merge add/drops of the same ref >>>>> since we >>>>> + * can run into issues with relocate dropping the >>>>> implicit >>>>> ref >>>>> + * and then it being added back again before the drop >>>>> can >>>>> + * finish. If we merged anything we need to re-loop so >>>>> we >>>>> can >>>>> + * get a good ref. 
>>>>> + * Or we can get node references of the same type that >>>>> weren't >>>>> + * merged when created due to bumps in the tree mod >>>>> seq, >>>>> and >>>>> + * we need to merge them to prevent adding an inline >>>>> extent >>>>> + * backref before dropping it (triggering a BUG_ON at >>>>> + * insert_inline_extent_backref()). >>>>> + */ >>>>> spin_lock(&locked_ref->lock); >>>>> + btrfs_merge_delayed_refs(trans, fs_info, delayed_refs, >>>>> + locked_ref); >>>>> >>>>> /* >>>>> * locked_ref is the head node, so we have to go one >>>>> >>>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to [email protected] >> More majordomo info at http://vger.kernel.org/majordomo-info.html
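
For anyone following the commit message quoted earlier in this thread,
the Ref1/Ref2/Ref3 scenario can be approximated in user space. The
sketch below is only a toy model under stated assumptions: struct
toy_ref, toy_merge_refs(), toy_select_ref() and toy_run_refs() are
hypothetical names that do not exist in btrfs, an integer stands in for
the inline backref in the extent tree, and a -1 return stands in for the
BUG_ON at insert_inline_extent_backref(). It is not the kernel
implementation, just an illustration of why merging refs below the
oldest held tree mod seq avoids running two adds back to back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a delayed tree block ref on a single bytenr. */
enum ref_action { REF_ADD, REF_DROP };

struct toy_ref {
	enum ref_action action;
	int ref_mod;	/* positive count; the sign is carried by action */
	uint64_t seq;	/* tree mod seq value when the ref was queued */
	bool live;	/* false once merged away or already run */
};

/*
 * Fold every live ref with seq < min_seq into the first such ref.
 * min_seq models the oldest seq still held by a backref walker;
 * 0 means nobody holds one, so every ref may be merged.
 * Returns the number of refs that are still live afterwards.
 */
size_t toy_merge_refs(struct toy_ref *refs, size_t n, uint64_t min_seq)
{
	struct toy_ref *keep = NULL;
	size_t live = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_ref *r = &refs[i];

		if (!r->live)
			continue;
		if (min_seq && r->seq >= min_seq) {
			live++;		/* a walker may still need this one */
			continue;
		}
		if (!keep) {
			keep = r;
			continue;
		}
		/* Same action adds up, opposite action cancels out. */
		keep->ref_mod += (r->action == keep->action) ?
				 r->ref_mod : -r->ref_mod;
		r->live = false;
	}

	if (keep) {
		if (keep->ref_mod < 0) {
			/* Net effect flipped direction: normalize the sign. */
			keep->action = (keep->action == REF_ADD) ?
				       REF_DROP : REF_ADD;
			keep->ref_mod = -keep->ref_mod;
		}
		if (keep->ref_mod == 0)
			keep->live = false;	/* add and drop cancelled */
		else
			live++;
	}
	return live;
}

/* Adds are preferred over drops, as described for select_delayed_ref(). */
struct toy_ref *toy_select_ref(struct toy_ref *refs, size_t n)
{
	struct toy_ref *first = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!refs[i].live)
			continue;
		if (refs[i].action == REF_ADD)
			return &refs[i];
		if (!first)
			first = &refs[i];
	}
	return first;
}

/*
 * Run all live refs. Returns 0 on success, or -1 when an add finds a
 * backref already present (where the real code would hit the BUG_ON).
 */
int toy_run_refs(struct toy_ref *refs, size_t n)
{
	int backrefs = 0;	/* toy "inline backref" count for bytenr A */
	struct toy_ref *r;

	while ((r = toy_select_ref(refs, n)) != NULL) {
		assert(r->ref_mod > 0);
		if (r->action == REF_ADD) {
			if (backrefs > 0)
				return -1;
			backrefs += r->ref_mod;
		} else {
			backrefs -= r->ref_mod;
		}
		r->live = false;
	}
	return 0;
}
```

With Ref1 (add, seq 0), Ref2 (drop, seq 1) and Ref3 (add, seq 2) queued
in that order, toy_run_refs() on the unmerged list runs Ref1 and then
Ref3 and returns -1. Calling toy_merge_refs(refs, 3, 0) first collapses
the three refs into a single add, after which toy_run_refs() returns 0.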
