On 18.04.2018 23:47, David Sterba wrote: > On Wed, Apr 18, 2018 at 09:41:54AM +0300, Nikolay Borisov wrote: >> When the delayed refs for a head are all run, eventually >> cleanup_ref_head is called which (in case of deletion) obtains a >> reference for the relevant btrfs_space_info struct by querying the bg >> for the range. This is problematic because when the last extent of a >> bg is deleted a race window emerges between removal of that bg and the >> subsequent invocation of cleanup_ref_head. This can result in cache being >> null >> and either a null pointer dereference or assertion failure. >> >> task: ffff8d04d31ed080 task.stack: ffff9e5dc10cc000 >> RIP: 0010:assfail.constprop.78+0x18/0x1a [btrfs] >> RSP: 0018:ffff9e5dc10cfbe8 EFLAGS: 00010292 >> RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000000 >> RDX: ffff8d04ffc1f868 RSI: ffff8d04ffc178c8 RDI: ffff8d04ffc178c8 >> RBP: ffff8d04d29e5ea0 R08: 00000000000001f0 R09: 0000000000000001 >> R10: ffff9e5dc0507d58 R11: 0000000000000001 R12: ffff8d04d29e5ea0 >> R13: ffff8d04d29e5f08 R14: ffff8d04efe29b40 R15: ffff8d04efe203e0 >> FS: 00007fbf58ead500(0000) GS:ffff8d04ffc00000(0000) >> knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00007fe6c6975648 CR3: 0000000013b2a000 CR4: 00000000000006f0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Call Trace: >> __btrfs_run_delayed_refs+0x10e7/0x12c0 [btrfs] >> btrfs_run_delayed_refs+0x68/0x250 [btrfs] >> btrfs_should_end_transaction+0x42/0x60 [btrfs] >> btrfs_truncate_inode_items+0xaac/0xfc0 [btrfs] >> btrfs_evict_inode+0x4c6/0x5c0 [btrfs] >> evict+0xc6/0x190 >> do_unlinkat+0x19c/0x300 >> do_syscall_64+0x74/0x140 >> entry_SYSCALL_64_after_hwframe+0x3d/0xa2 >> RIP: 0033:0x7fbf589c57a7 >> >> To fix this, introduce a new flag "is_system" to head_ref structs, >> which is populated at insertion time. This allows to decouple the >> querying for the spaceinfo from querying the possibly deleted bg. >> >> Fixes: d7eae3403f46 ("Btrfs: rework delayed ref total_bytes_pinned >> accounting") >> Suggested-by: Omar Sandoval <osan...@osandov.com> >> Signed-off-by: Nikolay Borisov <nbori...@suse.com> > > Added to next, thanks. > >> @@ -550,8 +550,10 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info, >> struct btrfs_delayed_ref_head *head_ref, >> struct btrfs_qgroup_extent_record *qrecord, >> u64 bytenr, u64 num_bytes, u64 ref_root, u64 reserved, >> - int action, int is_data, int *qrecord_inserted_ret, >> + int action, int is_data, int is_system, >> + int *qrecord_inserted_ret, >> int *old_ref_mod, int *new_ref_mod) > > At some point we might want to peel of a few arguments of that function. > It now has 14. The fs_info/trans should work and the > action/is_data/is_system could be merged to a bitmask.
Yes, I sent a series which splits the delayed ref initialisation + addition. It will conflicts with this patch but I intend to rebase. I'm talking about [PATCH 0/8] Delayed refs cleanups/streamlining > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html