On Fri, Oct 12, 2018 at 10:03:55AM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
> 
> When writing out a block group free space cache we can end deadlocking
> with ourseves on an extent buffer lock resulting in a warning like the
> following:
> 
>   [245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251 
> btrfs_tree_lock+0x1be/0x1d0 [btrfs]
>   [245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
>     W I      4.16.8 #1
>   [245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
>   [245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
>   [245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX: 
> 0000000000000001
>   [245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 
> ffff8807e20a3d20
>   [245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09: 
> ffff880000000000
>   [245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12: 
> ffff8807cb220630
>   [245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15: 
> ffff88041fbdf000
>   [245043.404780] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) 
> knlGS:0000000000000000
>   [245043.406050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4: 
> 00000000000206e0
>   [245043.408670] Call Trace:
>   [245043.409977]  btrfs_search_slot+0x761/0xa60 [btrfs]
>   [245043.411278]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
>   [245043.412572]  btrfs_insert_item+0x5b/0xc0 [btrfs]
>   [245043.413922]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
>   [245043.415216]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
>   [245043.416487]  find_free_extent+0xcd0/0xf60 [btrfs]
>   [245043.417813]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
>   [245043.419105]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
>   [245043.420378]  __btrfs_cow_block+0x127/0x550 [btrfs]
>   [245043.421652]  btrfs_cow_block+0xee/0x190 [btrfs]
>   [245043.422979]  btrfs_search_slot+0x227/0xa60 [btrfs]
>   [245043.424279]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
>   [245043.425538]  ? iput+0x72/0x1e0
>   [245043.426798]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
>   [245043.428131]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
>   [245043.429419]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
>   [245043.430712]  ? start_transaction+0x8e/0x410 [btrfs]
>   [245043.432006]  transaction_kthread+0x184/0x1a0 [btrfs]
>   [245043.433341]  kthread+0xf0/0x130
>   [245043.434628]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
>   [245043.435928]  ? kthread_create_worker_on_cpu+0x40/0x40
>   [245043.437236]  ret_from_fork+0x1f/0x30
>   [245043.441054] ---[ end trace 15abaa2aaf36827f ]---
> 
> This is because at write_one_cache_group() when we are COWing a leaf from
> the extent tree we end up allocating a new block group (chunk) and,
> because we have hit a threshold on the number of bytes reserved for system
> chunks, we attempt to finalize the creation of new block groups from the
> current transaction, by calling btrfs_create_pending_block_groups().
> However here we also need to modify the extent tree in order to insert
> a block group item, and if the location for this new block group item
> happens to be in the same leaf that we were COWing earlier, we deadlock
> since btrfs_search_slot() tries to write lock the extent buffer that we
> locked before at write_one_cache_group().
> 
> We have already hit similar cases in the past and commit d9a0540a79f8
> ("Btrfs: fix deadlock when finalizing block group creation") fixed some
> of those cases by delaying the creation of pending block groups at the
> known specific spots that could lead to a deadlock. This change reworks
> that commit to be more generic so that we don't have to add similar logic
> to every possible path that can lead to a deadlock. This is done by
> making __btrfs_cow_block() disallowing the creation of new block groups
> (setting the transaction's can_flush_pending_bgs to false) before it
> attempts to allocate a new extent buffer for either the extent, chunk or
> device trees, since those are the trees that pending block creation
> modifies. Once the new extent buffer is allocated, it allows creation of
> pending block groups to happen again.
> 
> This change depends on a recent patch from Josef which is not yet in
> Linus' tree, named "btrfs: make sure we create all new block groups" in
> order to avoid occasional warnings at btrfs_trans_release_chunk_metadata().
> 
> Fixes: d9a0540a79f8 ("Btrfs: fix deadlock when finalizing block group 
> creation")
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199753
> Link: 
> https://lore.kernel.org/linux-btrfs/CAJtFHUTHna09ST-_EEiyWmDH6gAqS6wa=zmnmbsifj8abu9...@mail.gmail.com/
> Reported-by: E V <eliven...@gmail.com>
> Signed-off-by: Filipe Manana <fdman...@suse.com>

Reviewed-by: Josef Bacik <jo...@toxicpanda.com>

Thanks,

Josef

Reply via email to