Hi Josef,

Thanks for the patch - sorry for the long delay in testing...


On 12/18/2012 06:52 AM, Josef Bacik wrote:
> On Wed, Dec 12, 2012 at 06:52:37PM -0700, Liu Bo wrote:
>> An user reported that he has hit an annoying deadlock while playing with
>> ceph based on btrfs.
>>
>> Current updating device tree requires space from METADATA chunk,
>> so we -may- need to do a recursive chunk allocation when adding/updating
>> dev extent, that is where the deadlock comes from.
>>
>> If we use SYSTEM metadata to update device tree, we can avoid the recursive
>> stuff.
>>
> 
> This is going to cause us to allocate much more system chunks than we used to
> which could land us in trouble.  Instead let's just keep us from re-entering if
> we're already allocating a chunk.  We do the chunk allocation when we don't have
> enough space for a cluster, but we'll likely have plenty of space to make an
> allocation.  Can you give this patch a try Jim and see if it fixes your problem?
> Thanks,
> 
> Josef
> 
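For anyone following the thread, the "keep us from re-entering" idea quoted
above can be illustrated with a small userspace sketch. This is only a model
under stated assumptions, not the actual patch and not btrfs code: struct
fs_state, alloc_chunk() and update_device_tree() are hypothetical names made
up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-filesystem state; not a btrfs structure. */
struct fs_state {
    bool allocating_chunk;       /* true while a chunk allocation is running */
    int chunks_allocated;        /* allocations that actually went ahead */
    int nested_attempts_skipped; /* re-entrant attempts that were refused */
};

static int update_device_tree(struct fs_state *fs);

/*
 * Allocate a chunk unless an allocation is already in progress.
 * Returns 0 both on success and when the nested attempt is skipped.
 */
static int alloc_chunk(struct fs_state *fs)
{
    int ret;

    if (fs->allocating_chunk) {
        /* Re-entered from the device tree update: do not recurse. */
        fs->nested_attempts_skipped++;
        return 0;
    }

    fs->allocating_chunk = true;
    fs->chunks_allocated++;
    ret = update_device_tree(fs);
    fs->allocating_chunk = false;
    return ret;
}

/*
 * Models the device tree update, which may itself decide it needs more
 * space and ask for another chunk (the recursive case in the report).
 */
static int update_device_tree(struct fs_state *fs)
{
    return alloc_chunk(fs);
}
```

In this model one outer call performs exactly one allocation, and the nested
request made during the device tree update is refused instead of recursing.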

With your patch applied to 3.7.1, I get the following on one
of my servers running Ceph OSDs.  The end effect is that some
of my ceph client writes hang. 

[ 1440.335752] ------------[ cut here ]------------
[ 1440.340602] WARNING: at fs/btrfs/super.c:246 __btrfs_abort_transaction+0x60/0x110 [btrfs]()
[ 1440.349117] Hardware name: X8DTH-i/6/iF/6F
[ 1440.353252] Modules linked in: btrfs zlib_deflate ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_mirror dm_region_hash 
dm_log dm_round_robin dm_multipath scsi_dh vhost_net macvtap macvlan tun uinput 
sg joydev sd_mod iTCO_wdt iTCO_vendor_support hid_generic button ata_piix 
libata coretemp kvm crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper 
cryptd lrw aes_x86_64 xts gf128mul microcode mpt2sas scsi_transport_sas 
raid_class scsi_mod serio_raw pcspkr mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core cxgb4 i2c_i801 i2c_core lpc_ich mfd_core ehci_hcd uhci_hcd ioatdma 
i7core_edac dm_mod edac_core nfsv4 auth_rpcgss nfsv3 nfs_acl nfsv2 nfs lockd 
sunrpc fscache broadcom tg3 hwmon bnx2 igb dca e1000
[ 1440.419398] Pid: 48686, comm: ceph-osd Not tainted 3.7.1-00006-gc794580 #484
[ 1440.426614] Call Trace:
[ 1440.429083]  [<ffffffff8103fed4>] warn_slowpath_common+0x94/0xc0
[ 1440.435110]  [<ffffffff8103ffb6>] warn_slowpath_fmt+0x46/0x50
[ 1440.440894]  [<ffffffffa05425c0>] __btrfs_abort_transaction+0x60/0x110 [btrfs]
[ 1440.448135]  [<ffffffffa059513d>] __btrfs_alloc_chunk+0x6cd/0x750 [btrfs]
[ 1440.454941]  [<ffffffffa059521e>] btrfs_alloc_chunk+0x5e/0x90 [btrfs]
[ 1440.461382]  [<ffffffffa05543a1>] ? check_system_chunk+0x71/0x130 [btrfs]
[ 1440.468188]  [<ffffffffa055474c>] do_chunk_alloc+0x2ec/0x370 [btrfs]
[ 1440.474562]  [<ffffffffa05509e9>] ? btrfs_reduce_alloc_profile+0xa9/0x120 [btrfs]
[ 1440.482050]  [<ffffffffa055839c>] btrfs_check_data_free_space+0x13c/0x2b0 [btrfs]
[ 1440.489558]  [<ffffffffa0559f40>] btrfs_delalloc_reserve_space+0x20/0x60 [btrfs]
[ 1440.497013]  [<ffffffffa057e31e>] __btrfs_buffered_write+0x15e/0x350 [btrfs]
[ 1440.504095]  [<ffffffffa057e849>] btrfs_file_aio_write+0x209/0x320 [btrfs]
[ 1440.511000]  [<ffffffffa057e640>] ? __btrfs_direct_write+0x130/0x130 [btrfs]
[ 1440.518062]  [<ffffffff81164ef4>] do_sync_readv_writev+0x94/0xe0
[ 1440.524105]  [<ffffffff81165f03>] do_readv_writev+0xe3/0x1e0
[ 1440.529792]  [<ffffffff81182ff2>] ? fget_light+0x122/0x170
[ 1440.535275]  [<ffffffff81166046>] vfs_writev+0x46/0x60
[ 1440.540412]  [<ffffffff8116617f>] sys_writev+0x5f/0xc0
[ 1440.545547]  [<ffffffff81264b3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1440.551987]  [<ffffffff814b7102>] system_call_fastpath+0x16/0x1b
[ 1440.558016] ---[ end trace 764e83a458dabca6 ]---
[ 1440.562662] BTRFS warning (device dm-32): __btrfs_alloc_chunk:3488: Aborting unused transaction(error 28).
[ 1440.595987] BTRFS warning (device dm-32): find_free_extent:5871: Aborting unused transaction(Object already exists).
[ 1440.606542] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1440.614382] IP: [<ffffffffa0584e5e>] map_private_extent_buffer+0xe/0xf0 [btrfs]
[ 1440.621704] PGD 6138e8067 PUD 56749f067 PMD 0 
[ 1440.626190] Oops: 0000 [#1] SMP 
[ 1440.629442] Modules linked in: btrfs zlib_deflate ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_mirror dm_region_hash 
dm_log dm_round_robin dm_multipath scsi_dh vhost_net macvtap macvlan tun uinput 
sg joydev sd_mod iTCO_wdt iTCO_vendor_support hid_generic button ata_piix 
libata coretemp kvm crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper 
cryptd lrw aes_x86_64 xts gf128mul microcode mpt2sas scsi_transport_sas 
raid_class scsi_mod serio_raw pcspkr mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core cxgb4 i2c_i801 i2c_core lpc_ich mfd_core ehci_hcd uhci_hcd ioatdma 
i7core_edac dm_mod edac_core nfsv4 auth_rpcgss nfsv3 nfs_acl nfsv2 nfs lockd 
sunrpc fscache broadcom tg3 hwmon bnx2 igb dca e1000
[ 1440.694855] CPU 16 
[ 1440.696784] Pid: 48687, comm: ceph-osd Tainted: G        W    3.7.1-00006-gc794580 #484 Supermicro X8DTH-i/6/iF/6F/X8DTH
[ 1440.707803] RIP: 0010:[<ffffffffa0584e5e>]  [<ffffffffa0584e5e>] map_private_extent_buffer+0xe/0xf0 [btrfs]
[ 1440.717544] RSP: 0018:ffff880b740db9f8  EFLAGS: 00010292
[ 1440.722841] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff880b740dba28
[ 1440.729947] RDX: 0000000000000004 RSI: 0000000000000076 RDI: 0000000000000000
[ 1440.737055] RBP: ffff880b740dba08 R08: ffff880b740dba20 R09: ffff880b740dba18
[ 1440.744167] R10: ffff88092bba8000 R11: ffff880a4138c320 R12: 0000000000000000
[ 1440.751280] R13: 0000000000000065 R14: 0000000000000011 R15: 0000000000000076
[ 1440.758395] FS:  00007fffeb4c3700(0000) GS:ffff880627d40000(0000) knlGS:0000000000000000
[ 1440.766460] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1440.772188] CR2: 0000000000000000 CR3: 00000004bd2a4000 CR4: 00000000000007e0
[ 1440.779303] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1440.786416] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1440.793523] Process ceph-osd (pid: 48687, threadinfo ffff880b740da000, task ffff8808f801bec0)
[ 1440.802018] Stack:
[ 1440.804030]  ffff880b740dbb98 0000000000000000 ffff880b740dba68 ffffffffa0581e3c
[ 1440.811464]  ffff880977dbd030 ffff880c00000002 ffff8808f801c5f0 0000000000000053
[ 1440.818897]  ffff880b740dbae4 ffff880612084c60 0000000000000000 ffff880612084c60
[ 1440.826330] Call Trace:
[ 1440.828800]  [<ffffffffa0581e3c>] btrfs_get_token_32+0x8c/0xf0 [btrfs]
[ 1440.835327]  [<ffffffffa056042d>] btrfs_match_dir_item_name+0x4d/0x140 [btrfs]
[ 1440.842545]  [<ffffffffa0560919>] insert_with_overflow+0x59/0x120 [btrfs]
[ 1440.849315]  [<ffffffffa0560ca6>] btrfs_insert_xattr_item+0xb6/0x1d0 [btrfs]
[ 1440.856343]  [<ffffffffa056d279>] ? join_transaction+0x29/0x370 [btrfs]
[ 1440.862945]  [<ffffffffa056d30f>] ? join_transaction+0xbf/0x370 [btrfs]
[ 1440.869536]  [<ffffffff81159ac3>] ? kmem_cache_alloc+0xd3/0x170
[ 1440.875450]  [<ffffffffa0582b3a>] do_setxattr+0x17a/0x240 [btrfs]
[ 1440.881534]  [<ffffffffa0582c8b>] __btrfs_setxattr+0x8b/0x110 [btrfs]
[ 1440.887965]  [<ffffffffa0582f27>] btrfs_setxattr+0xa7/0xc0 [btrfs]
[ 1440.894130]  [<ffffffff8118a19b>] __vfs_setxattr_noperm+0x7b/0x150
[ 1440.900287]  [<ffffffff8118a2fe>] vfs_setxattr+0x8e/0xc0
[ 1440.905591]  [<ffffffff8118a4e5>] setxattr+0x1b5/0x230
[ 1440.910713]  [<ffffffff81167347>] ? __sb_start_write+0x1b7/0x200
[ 1440.916702]  [<ffffffff81185378>] ? mnt_want_write_file+0x28/0x60
[ 1440.922778]  [<ffffffff81182f40>] ? fget_light+0x70/0x170
[ 1440.928168]  [<ffffffff81185378>] ? mnt_want_write_file+0x28/0x60
[ 1440.934242]  [<ffffffff81182ff2>] ? fget_light+0x122/0x170
[ 1440.939713]  [<ffffffff8118a5ec>] sys_fsetxattr+0x8c/0xe0
[ 1440.945097]  [<ffffffff814b7102>] system_call_fastpath+0x16/0x1b
[ 1440.951083] Code: ef 88 00 00 00 48 89 e5 e8 a0 ff ff ff c9 c3 66 66 66 66 
66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <4c> 8b 
17 41 81 e2 ff 0f 00 00 4a 8d 04 16 4c 8d 5c 10 ff 48 89 
[ 1440.971006] RIP  [<ffffffffa0584e5e>] map_private_extent_buffer+0xe/0xf0 [btrfs]
[ 1440.978415]  RSP <ffff880b740db9f8>
[ 1440.981896] CR2: 0000000000000000
[ 1440.985557] ---[ end trace 764e83a458dabca7 ]---
[ 1440.990075] divide error: 0000 [#2] SMP 
[ 1440.990133] Modules linked in: btrfs zlib_deflate ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_mirror dm_region_hash 
dm_log dm_round_robin dm_multipath scsi_dh vhost_net macvtap macvlan tun uinput 
sg joydev sd_mod iTCO_wdt iTCO_vendor_support hid_generic button ata_piix 
libata coretemp kvm crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper 
cryptd lrw aes_x86_64 xts gf128mul microcode mpt2sas scsi_transport_sas 
raid_class scsi_mod serio_raw pcspkr mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core cxgb4 i2c_i801 i2c_core lpc_ich mfd_core ehci_hcd uhci_hcd ioatdma 
i7core_edac dm_mod edac_core nfsv4 auth_rpcgss nfsv3 nfs_acl nfsv2 nfs lockd 
sunrpc fscache broadcom tg3 hwmon bnx2 igb dca e1000
[ 1440.990139] CPU 20 
[ 1440.990139] Pid: 48693, comm: ceph-osd Tainted: G      D W    3.7.1-00006-gc794580 #484 Supermicro X8DTH-i/6/iF/6F/X8DTH
[ 1440.990163] RIP: 0010:[<ffffffffa059429d>]  [<ffffffffa059429d>] __btrfs_map_block+0xcd/0x670 [btrfs]
[ 1440.990187] RSP: 0018:ffff880b740f5ad8  EFLAGS: 00010246
[ 1440.990194] RAX: 0000000000800000 RBX: 0000000000800000 RCX: 0000000040000000
[ 1440.990195] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 1440.990195] RBP: ffff880b740f5b68 R08: 0000000000000000 R09: 0000000000000000
[ 1440.990196] R10: ffff88062311f6e8 R11: 0000000000000000 R12: ffff880b740f5b90
[ 1440.990200] R13: ffff8805054971c0 R14: ffff880c182f4298 R15: ffff880b740f5e68
[ 1440.990201] FS:  00007fffe6cba700(0000) GS:ffff880c3fd00000(0000) knlGS:0000000000000000
[ 1440.990202] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1440.990203] CR2: ffffffffff600400 CR3: 00000004bd2a4000 CR4: 00000000000007e0
[ 1440.990207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1440.990207] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1440.990209] Process ceph-osd (pid: 48693, threadinfo ffff880b740f4000, task ffff8809877d8000)
[ 1440.990209] Stack:
[ 1440.990217]  ffff88092bba8000 ffff880156a22e00 ffff88062311f6e8 ffff880156a23388
[ 1440.990225]  0000000000000000 ffffffff8111365d 0000000000000000 0000000000000000
[ 1440.990230]  00000000740f5b98 0000000000000046 0000000000000000 ffffffff8111365d
[ 1440.990230] Call Trace:
[ 1440.990236]  [<ffffffff8111365d>] ? test_set_page_writeback+0x6d/0x170
[ 1440.990291]  [<ffffffff8111365d>] ? test_set_page_writeback+0x6d/0x170
[ 1440.990307]  [<ffffffffa059484e>] btrfs_map_block+0xe/0x10 [btrfs]
[ 1440.990349]  [<ffffffffa0571307>] btrfs_merge_bio_hook+0x57/0x80 [btrfs]
[ 1440.990458]  [<ffffffffa0585ba3>] submit_extent_page+0xc3/0x1d0 [btrfs]
[ 1440.990487]  [<ffffffff8110a2f0>] ? find_get_pages+0x1c0/0x1c0
[ 1440.990525]  [<ffffffffa058ba7f>] __extent_writepage+0x69f/0x760 [btrfs]
[ 1440.990571]  [<ffffffffa0585ed0>] ? extent_io_tree_init+0x90/0x90 [btrfs]
[ 1440.990680]  [<ffffffffa058bf52>] extent_write_cache_pages.clone.3+0x242/0x3d0 [btrfs]
[ 1440.990733]  [<ffffffffa058c12f>] extent_writepages+0x4f/0x70 [btrfs]
[ 1440.990784]  [<ffffffffa0577630>] ? btrfs_lookup+0x70/0x70 [btrfs]
[ 1440.990848]  [<ffffffff81182ff2>] ? fget_light+0x122/0x170
[ 1440.990870]  [<ffffffffa0571df7>] btrfs_writepages+0x27/0x30 [btrfs]
[ 1440.990886]  [<ffffffff81115423>] do_writepages+0x23/0x40
[ 1440.990889]  [<ffffffff811099ce>] __filemap_fdatawrite_range+0x4e/0x50
[ 1440.990920]  [<ffffffff81109c83>] filemap_fdatawrite_range+0x13/0x20
[ 1440.990982]  [<ffffffff81195589>] sys_sync_file_range+0x109/0x170
[ 1440.991022]  [<ffffffff814b7102>] system_call_fastpath+0x16/0x1b
[ 1440.991149] Code: 66 0f 1f 44 00 00 4d 8b 6a 60 48 29 c3 8b 45 c4 41 39 45 
18 b8 00 00 00 00 0f 4d 45 c4 31 d2 89 45 c4 49 63 75 10 48 89 d8 89 f7 <48> f7 
f7 49 89 c6 48 89 45 c8 4c 0f af f6 4c 39 f3 73 10 0f 0b 
[ 1440.991174] RIP  [<ffffffffa059429d>] __btrfs_map_block+0xcd/0x670 [btrfs]
[ 1440.991203]  RSP <ffff880b740f5ad8>
[ 1440.991206] ---[ end trace 764e83a458dabca8 ]---
[ 1451.948155] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a9
[ 1451.956010] IP: [<ffffffffa05949d4>] btrfs_map_bio+0x184/0x220 [btrfs]
[ 1451.962580] PGD 0 
[ 1451.964620] Oops: 0000 [#3] SMP 
[ 1451.967887] Modules linked in: btrfs zlib_deflate ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_mirror dm_region_hash 
dm_log dm_round_robin dm_multipath scsi_dh vhost_net macvtap macvlan tun uinput 
sg joydev sd_mod iTCO_wdt iTCO_vendor_support hid_generic button ata_piix 
libata coretemp kvm crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper 
cryptd lrw aes_x86_64 xts gf128mul microcode mpt2sas scsi_transport_sas 
raid_class scsi_mod serio_raw pcspkr mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core cxgb4 i2c_i801 i2c_core lpc_ich mfd_core ehci_hcd uhci_hcd ioatdma 
i7core_edac dm_mod edac_core nfsv4 auth_rpcgss nfsv3 nfs_acl nfsv2 nfs lockd 
sunrpc fscache broadcom tg3 hwmon bnx2 igb dca e1000
[ 1452.033336] CPU 5 
[ 1452.035177] Pid: 25627, comm: btrfs-worker-1 Tainted: G      D W    3.7.1-00006-gc794580 #484 Supermicro X8DTH-i/6/iF/6F/X8DTH
[ 1452.046715] RIP: 0010:[<ffffffffa05949d4>]  [<ffffffffa05949d4>] btrfs_map_bio+0x184/0x220 [btrfs]
[ 1452.055688] RSP: 0018:ffff88050e967cc8  EFLAGS: 00010202
[ 1452.060987] RAX: 000000000000000c RBX: ffff880959c9ea80 RCX: ffff880959c9ea80
[ 1452.068100] RDX: ffff88060bd03060 RSI: 0000000000000001 RDI: ffff88062311f6e8
[ 1452.075212] RBP: ffff88050e967d28 R08: ffff88060bd03060 R09: 0000000000000009
[ 1452.082327] R10: ffff88062311f6e8 R11: 0000000000000000 R12: 0000000000000001
[ 1452.089442] R13: 0000000000000000 R14: 0000000000000004 R15: ffff88092bba8000
[ 1452.096554] FS:  0000000000000000(0000) GS:ffff880627ca0000(0000) knlGS:0000000000000000
[ 1452.104621] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1452.110352] CR2: 00000000000000a9 CR3: 0000000001a0b000 CR4: 00000000000007e0
[ 1452.117466] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1452.124577] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1452.131693] Process btrfs-worker-1 (pid: 25627, threadinfo ffff88050e966000, task ffff880612418000)
[ 1452.140707] Stack:
[ 1452.142720]  0000000000000000 000000000040e010 00000001182f5470 0000000100000000
[ 1452.150160]  ffff88060bd03060 000000003f7fe000 ffff88050e967d38 ffff880959c9e7c8
[ 1452.157601]  ffff880959c9e780 ffff880c182f5470 ffff880c182f5428 ffff880c182f5418
[ 1452.165061] Call Trace:
[ 1452.167540]  [<ffffffffa0570bab>] __btrfs_submit_bio_done+0x1b/0x20 [btrfs]
[ 1452.174501]  [<ffffffffa0566a41>] run_one_async_done+0xc1/0xd0 [btrfs]
[ 1452.181027]  [<ffffffffa0596a93>] run_ordered_completions+0x83/0xd0 [btrfs]
[ 1452.187991]  [<ffffffffa05975c8>] worker_loop+0x1b8/0x410 [btrfs]
[ 1452.194087]  [<ffffffffa0597410>] ? check_pending_worker_creates+0xe0/0xe0 [btrfs]
[ 1452.201639]  [<ffffffff81066df1>] kthread+0xe1/0xf0
[ 1452.206528]  [<ffffffff81066d10>] ? __init_kthread_worker+0x70/0x70
[ 1452.212779]  [<ffffffff814b705c>] ret_from_fork+0x7c/0xb0
[ 1452.218167]  [<ffffffff81066d10>] ? __init_kthread_worker+0x70/0x70
[ 1452.224411] Code: 48 89 51 48 48 8d 14 40 48 8b 45 c0 48 c1 e2 03 48 01 d0 
48 8b 40 38 48 c1 e8 09 48 89 01 48 03 55 c0 48 8b 72 30 48 85 f6 74 4c <48> 8b 
86 a8 00 00 00 48 85 c0 74 40 41 83 fc 01 75 0a 8b 56 60 
[ 1452.244357] RIP  [<ffffffffa05949d4>] btrfs_map_bio+0x184/0x220 [btrfs]
[ 1452.250995]  RSP <ffff88050e967cc8>
[ 1452.254485] CR2: 00000000000000a9
[ 1452.258149] ---[ end trace 764e83a458dabca9 ]---

