On Fri, Jan 25, 2019 at 11:48:51AM +0000, fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
>
> When splitting a leaf or node from one of the trees that are modified when
> flushing pending block groups (extent, chunk, device and free space trees),
> we need to allocate a new tree block, which in turn can result in the need
> to allocate a new block group. After allocating the new block group we may
> need to flush new block groups that were previously allocated during the
> course of the current transaction, which is what may cause a deadlock due
> to attempts to write lock twice the same leaf or node, as when splitting
> a leaf or node we are holding a write lock on it and its parent node.
>
> The same type of deadlock can also happen when increasing the tree's
> height, since we are holding a lock on the existing root while allocating
> the tree block to use as the new root node.
>
> An example trace when the deadlock happens during the leaf split path is:
>
> [27175.293054] CPU: 0 PID: 3005 Comm: kworker/u17:6 Tainted: G W 4.19.16 #1
> [27175.293942] Hardware name: Penguin Computing Relion 1900/MD90-FS0-ZB-XX, BIOS R15 06/25/2018
> [27175.294846] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
> (...)
> [27175.298384] RSP: 0018:ffffab2087107758 EFLAGS: 00010246
> [27175.299269] RAX: 0000000000000bbd RBX: ffff9fadc7141c48 RCX: 0000000000000001
> [27175.300155] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff9fadc7141c48
> [27175.301023] RBP: 0000000000000001 R08: ffff9faeb6ac1040 R09: ffff9fa9c0000000
> [27175.301887] R10: 0000000000000000 R11: 0000000000000040 R12: ffff9fb21aac8000
> [27175.302743] R13: ffff9fb1a64d6a20 R14: 0000000000000001 R15: ffff9fb1a64d6a18
> [27175.303601] FS: 0000000000000000(0000) GS:ffff9fb21fa00000(0000) knlGS:0000000000000000
> [27175.304468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [27175.305339] CR2: 00007fdc8743ead8 CR3: 0000000763e0a006 CR4: 00000000003606f0
> [27175.306220] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [27175.307087] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [27175.307940] Call Trace:
> [27175.308802] btrfs_search_slot+0x779/0x9a0 [btrfs]
> [27175.309669] ? update_space_info+0xba/0xe0 [btrfs]
> [27175.310534] btrfs_insert_empty_items+0x67/0xc0 [btrfs]
> [27175.311397] btrfs_insert_item+0x60/0xd0 [btrfs]
> [27175.312253] btrfs_create_pending_block_groups+0xee/0x210 [btrfs]
> [27175.313116] do_chunk_alloc+0x25f/0x300 [btrfs]
> [27175.313984] find_free_extent+0x706/0x10d0 [btrfs]
> [27175.314855] btrfs_reserve_extent+0x9b/0x1d0 [btrfs]
> [27175.315707] btrfs_alloc_tree_block+0x100/0x5b0 [btrfs]
> [27175.316548] split_leaf+0x130/0x610 [btrfs]
> [27175.317390] btrfs_search_slot+0x94d/0x9a0 [btrfs]
> [27175.318235] btrfs_insert_empty_items+0x67/0xc0 [btrfs]
> [27175.319087] alloc_reserved_file_extent+0x84/0x2c0 [btrfs]
> [27175.319938] __btrfs_run_delayed_refs+0x596/0x1150 [btrfs]
> [27175.320792] btrfs_run_delayed_refs+0xed/0x1b0 [btrfs]
> [27175.321643] delayed_ref_async_start+0x81/0x90 [btrfs]
> [27175.322491] normal_work_helper+0xd0/0x320 [btrfs]
> [27175.323328] ? move_linked_works+0x6e/0xa0
> [27175.324160] process_one_work+0x191/0x370
> [27175.324976] worker_thread+0x4f/0x3b0
> [27175.325763] kthread+0xf8/0x130
> [27175.326531] ? rescuer_thread+0x320/0x320
> [27175.327284] ? kthread_create_worker_on_cpu+0x50/0x50
> [27175.328027] ret_from_fork+0x35/0x40
> [27175.328741] ---[ end trace 300a1b9f0ac30e26 ]---
>
> Fix this by preventing the flushing of new block groups when splitting a
> leaf/node and when inserting a new root node for one of the trees modified
> by the flushing operation, similar to what is done when COWing a node/leaf
> from one of these trees.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202383
> Reported-by: Eli V <eliven...@gmail.com>
> Signed-off-by: Filipe Manana <fdman...@suse.com>

Added to 5.0-rc queue, thanks.
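
For readers following the thread, the re-entrancy described in the changelog can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the patch itself: the structures, the function bodies and the `guard` parameter are hypothetical stand-ins, the flag name can_flush_pending_bgs is assumed here for illustration, and the "deadlock" is merely counted rather than actually blocking. It also simplifies by treating the tree being split and the tree touched by the flush as the same one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's real structures. */
struct tree {
	bool write_locked;   /* models the write lock held on a leaf/node */
	bool used_by_flush;  /* true for trees modified when flushing block groups */
};

struct trans {
	bool can_flush_pending_bgs; /* guard consulted before flushing (assumed name) */
	int pending_bgs;            /* block groups created but not yet flushed */
	int deadlocks;              /* attempts to re-lock an already locked leaf */
};

/* Flushing inserts items into the tree, so it needs the leaf's write lock. */
void flush_pending_bgs(struct trans *t, struct tree *tr)
{
	if (tr->write_locked) {
		t->deadlocks++; /* a real kernel task would block forever here */
		return;
	}
	t->pending_bgs = 0;
}

/* Allocating a tree block may create a block group and then try to flush. */
void alloc_tree_block(struct trans *t, struct tree *tr)
{
	t->pending_bgs++;
	if (t->can_flush_pending_bgs)
		flush_pending_bgs(t, tr);
}

/* Split a leaf while holding its write lock; `guard` enables the fix. */
void split_leaf(struct trans *t, struct tree *tr, bool guard)
{
	bool saved = t->can_flush_pending_bgs;

	tr->write_locked = true;
	if (guard && tr->used_by_flush)
		t->can_flush_pending_bgs = false; /* defer the flush */
	alloc_tree_block(t, tr);
	t->can_flush_pending_bgs = saved;
	tr->write_locked = false;
}
```

Calling split_leaf() with guard set to false on a tree that has used_by_flush set increments the deadlocks counter, which corresponds to the nested btrfs_search_slot() in the trace. With guard set to true the counter stays at zero and the new block group simply remains pending until a later point where no leaf lock is held.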