On Sun, May 21, 2017 at 02:47:33PM -0700, Marc MERLIN wrote:
> gargamel:~# btrfs check --repair /dev/mapper/dshelf1
> enabling repair mode
> Checking filesystem on /dev/mapper/dshelf1
> UUID: 36f5079e-ca6c-4855-8639-ccb82695c18d
> checking extents
> 
> This causes a bunch of these:
> btrfs-transacti: page allocation stalls for 23508ms, order:0, 
> mode:0x1400840(GFP_NOFS|__GFP_NOFAIL), nodemask=(null)
> btrfs-transacti cpuset=/ mems_allowed=0
> 
> What's the recommended way out of this and which code is at fault? I can't 
> tell if btrfs is doing memory allocations wrong, or if it's just being 
> undermined by the block layer dying underneath.

I went back to 4.8.10, and similar problem.
It looks like btrfs check exercises the kernel and causes everything to come 
down to a halt :(

Sadly, I tried a scrub on the same device, and it stalled after 6TB. The scrub 
process went zombie
and the scrub never succeeded, nor could it be stopped.

What do I try next? My filesystem seems ok when I use it except for that
BUG() crash I just reported a few days ago. I'm willing to believe there is a 
some problem with it somewhere
but if I can't scrub or check it, it's kind of hard to look into it further 

[ 1090.912073] INFO: task kworker/dying:63 blocked for more than 120 seconds.
[ 1090.933850]       Tainted: G     U          
4.8.10-amd64-preempt-sysrq-20161121vb3tj1 #12
[ 1090.959465] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1090.983973] kworker/dying   D ffff9a23424e3d68     0    63      2 0x00000000
[ 1091.006171]  ffff9a23424e3d68 00ff9a213ab32140 ffffffff8dc0d4c0 
ffff9a23424dc100
[ 1091.029349]  ffff9a23424e3d50 ffff9a23424e4000 ffff9a234098d064 
ffff9a23424dc100
[ 1091.052490]  ffff9a234098d068 00000000ffffffff ffff9a23424e3d80 
ffffffff8d6cf1a6
[ 1091.075679] Call Trace:
[ 1091.083882]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1091.099532]  [<ffffffff8d6cf44b>] schedule_preempt_disabled+0x18/0x24
[ 1091.119518]  [<ffffffff8d6d0841>] __mutex_lock_slowpath+0xce/0x16d
[ 1091.138705]  [<ffffffff8d6d08f7>] mutex_lock+0x17/0x27
[ 1091.154772]  [<ffffffff8d6d08f7>] ? mutex_lock+0x17/0x27
[ 1091.171382]  [<ffffffff8d0da5c6>] acct_process+0x4e/0xe0
[ 1091.187974]  [<ffffffff8d07d401>] ? rescuer_thread+0x24f/0x2d1
[ 1091.206170]  [<ffffffff8d06a5c1>] do_exit+0x3ba/0x97b
[ 1091.222001]  [<ffffffff8d192a8c>] ? kfree+0x7a/0x99
[ 1091.237307]  [<ffffffff8d07d72e>] ? worker_thread+0x2ab/0x2ba
[ 1091.255219]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1091.273390]  [<ffffffff8d081dfb>] kthread+0xbc/0xbc
[ 1091.288672]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1091.305524]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24
[ 1091.323404] INFO: task kworker/u16:4:158 blocked for more than 120 seconds.
[ 1091.344956]       Tainted: G     U          
4.8.10-amd64-preempt-sysrq-20161121vb3tj1 #12
[ 1091.370145] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1091.394299] kworker/u16:4   D ffff9a233e607b58     0   158      2 0x00000000
[ 1091.416260] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 1091.435259]  ffff9a233e607b58 00ffffff8d0899ae ffff9a21f0c76180 
ffff9a233e6021c0
[ 1091.458328]  ffff9a233e607b40 ffff9a233e608000 7fffffffffffffff 
ffff9a233e6021c0
[ 1091.481385]  ffffffff8d6d1244 ffff9a2317491e68 ffff9a233e607b70 
ffffffff8d6cf1a6
[ 1091.504472] Call Trace:
[ 1091.512751]  [<ffffffff8d6d1244>] ? usleep_range+0x65/0x65
[ 1091.530093]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1091.545833]  [<ffffffff8d6d1287>] schedule_timeout+0x43/0x126
[ 1091.563782]  [<ffffffff8d08a9d3>] ? wake_up_process+0x15/0x17
[ 1091.581707]  [<ffffffff8d6cfb45>] do_wait_for_common+0x123/0x15f
[ 1091.600403]  [<ffffffff8d6cfb45>] ? do_wait_for_common+0x123/0x15f
[ 1091.619625]  [<ffffffff8d08aa1c>] ? wake_up_q+0x47/0x47
[ 1091.635983]  [<ffffffff8d6cfc02>] wait_for_common+0x3b/0x55
[ 1091.653380]  [<ffffffff8d6cfc39>] wait_for_completion+0x1d/0x1f
[ 1091.671811]  [<ffffffff8d28d247>] btrfs_async_run_delayed_refs+0xd3/0xed
[ 1091.692598]  [<ffffffff8d2a5dbe>] __btrfs_end_transaction+0x2a7/0x2dd
[ 1091.712585]  [<ffffffff8d2a5e04>] btrfs_end_transaction+0x10/0x12
[ 1091.731529]  [<ffffffff8d2ad5fa>] btrfs_finish_ordered_io+0x3f7/0x4db
[ 1091.751495]  [<ffffffff8d2ad946>] finish_ordered_fn+0x15/0x17
[ 1091.769372]  [<ffffffff8d2cfd68>] btrfs_scrubparity_helper+0x10e/0x258
[ 1091.789590]  [<ffffffff8d2cff40>] btrfs_endio_write_helper+0xe/0x10
[ 1091.809014]  [<ffffffff8d07d068>] process_one_work+0x186/0x29d
[ 1091.827123]  [<ffffffff8d07d66d>] worker_thread+0x1ea/0x2ba
[ 1091.844438]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1091.862521]  [<ffffffff8d081df3>] kthread+0xb4/0xbc
[ 1091.877718]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1091.894476]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24
[ 1091.912276] INFO: task kworker/u16:5:159 blocked for more than 120 seconds.
[ 1091.933740]       Tainted: G     U          
4.8.10-amd64-preempt-sysrq-20161121vb3tj1 #12
[ 1091.958847] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1091.982906] kworker/u16:5   D ffff9a233e60f9c0     0   159      2 0x00000000
[ 1092.004713] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
[ 1092.023611]  ffff9a233e60f9c0 0000000000000246 ffff9a2342fa41c0 
ffff9a233e608200
[ 1092.046575]  ffff9a233e60f9a8 ffff9a233e610000 ffff9a213898bd88 
ffff9a233052f800
[ 1092.069536]  0000000000000001 0000000000000001 ffff9a233e60f9d8 
ffffffff8d6cf1a6
[ 1092.092523] Call Trace:
[ 1092.100496]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1092.115995]  [<ffffffff8d2d866d>] btrfs_tree_lock+0xd6/0x1fb
[ 1092.133574]  [<ffffffff8d09af6c>] ? wake_up_atomic_t+0x2c/0x2c
[ 1092.151696]  [<ffffffff8d2847ff>] btrfs_search_slot+0x684/0x6fb
[ 1092.170074]  [<ffffffff8d28a364>] lookup_inline_extent_backref+0xf3/0x45b
[ 1092.191060]  [<ffffffff8d190787>] ? cpu_cache_get.isra.44+0xd/0x1d
[ 1092.210225]  [<ffffffff8d0a12c5>] ? __raw_write_unlock+0x11/0x23
[ 1092.228873]  [<ffffffff8d28b994>] __btrfs_free_extent+0x100/0x9ac
[ 1092.247770]  [<ffffffff8d088000>] ? preempt_latency_start+0x34/0x5d
[ 1092.267183]  [<ffffffff8d2e860f>] ? btrfs_merge_delayed_refs+0x90/0x270
[ 1092.287638]  [<ffffffff8d2900fe>] __btrfs_run_delayed_refs+0xa02/0xd13
[ 1092.307818]  [<ffffffff8d190787>] ? cpu_cache_get.isra.44+0xd/0x1d
[ 1092.326947]  [<ffffffff8d291d92>] btrfs_run_delayed_refs+0x86/0x1dc
[ 1092.346331]  [<ffffffff8d291f42>] delayed_ref_async_start+0x5a/0x99
[ 1092.365732]  [<ffffffff8d2cfd68>] btrfs_scrubparity_helper+0x10e/0x258
[ 1092.385888]  [<ffffffff8d2cfef0>] btrfs_extent_refs_helper+0xe/0x10
[ 1092.405244]  [<ffffffff8d07d068>] process_one_work+0x186/0x29d
[ 1092.423312]  [<ffffffff8d07d66d>] worker_thread+0x1ea/0x2ba
[ 1092.440625]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1092.458723]  [<ffffffff8d081df3>] kthread+0xb4/0xbc
[ 1092.473931]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1092.490685]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24
[ 1092.508525] INFO: task kworker/u16:8:162 blocked for more than 120 seconds.
[ 1092.529991]       Tainted: G     U          
4.8.10-amd64-preempt-sysrq-20161121vb3tj1 #12
[ 1092.555104] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1092.579194] kworker/u16:8   D ffff9a233e623b58     0   162      2 0x00000000
[ 1092.601033] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 1092.619956]  ffff9a233e623b58 00ffffff8d0899ae ffff9a2342f80080 
ffff9a233e61a000
[ 1092.642934]  ffff9a233e623b40 ffff9a233e624000 7fffffffffffffff 
ffff9a233e61a000
[ 1092.665923]  ffffffff8d6d1244 ffff9a2326a0d628 ffff9a233e623b70 
ffffffff8d6cf1a6
[ 1092.688901] Call Trace:
[ 1092.696864]  [<ffffffff8d6d1244>] ? usleep_range+0x65/0x65
[ 1092.713931]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1092.729451]  [<ffffffff8d6d1287>] schedule_timeout+0x43/0x126
[ 1092.747312]  [<ffffffff8d3b7e78>] ? debug_smp_processor_id+0x17/0x19
[ 1092.767021]  [<ffffffff8d08a9d3>] ? wake_up_process+0x15/0x17
[ 1092.784903]  [<ffffffff8d6cfb45>] do_wait_for_common+0x123/0x15f
[ 1092.803533]  [<ffffffff8d6cfb45>] ? do_wait_for_common+0x123/0x15f
[ 1092.822696]  [<ffffffff8d08aa1c>] ? wake_up_q+0x47/0x47
[ 1092.838989]  [<ffffffff8d6cfc02>] wait_for_common+0x3b/0x55
[ 1092.856310]  [<ffffffff8d6cfc39>] wait_for_completion+0x1d/0x1f
[ 1092.874683]  [<ffffffff8d28d247>] btrfs_async_run_delayed_refs+0xd3/0xed
[ 1092.895383]  [<ffffffff8d2a5dbe>] __btrfs_end_transaction+0x2a7/0x2dd
[ 1092.915269]  [<ffffffff8d2a5e04>] btrfs_end_transaction+0x10/0x12
[ 1092.934131]  [<ffffffff8d2ad5fa>] btrfs_finish_ordered_io+0x3f7/0x4db
[ 1092.954047]  [<ffffffff8d088000>] ? preempt_latency_start+0x34/0x5d
[ 1092.973440]  [<ffffffff8d2ad946>] finish_ordered_fn+0x15/0x17
[ 1092.991244]  [<ffffffff8d2cfd68>] btrfs_scrubparity_helper+0x10e/0x258
[ 1093.011384]  [<ffffffff8d07b8c1>] ? pwq_activate_delayed_work+0x4d/0x5b
[ 1093.031800]  [<ffffffff8d2cff40>] btrfs_endio_write_helper+0xe/0x10
[ 1093.051185]  [<ffffffff8d07d068>] process_one_work+0x186/0x29d
[ 1093.069265]  [<ffffffff8d07d66d>] worker_thread+0x1ea/0x2ba
[ 1093.086569]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1093.104632]  [<ffffffff8d081df3>] kthread+0xb4/0xbc
[ 1093.119828]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1093.136618]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24
[ 1093.154411] INFO: task kworker/u16:13:171 blocked for more than 120 seconds.
[ 1093.176141]       Tainted: G     U          
4.8.10-amd64-preempt-sysrq-20161121vb3tj1 #12
[ 1093.201248] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1093.225310] kworker/u16:13  D ffff9a233e653b58     0   171      2 0x00000000
[ 1093.247120] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 1093.266059]  ffff9a233e653b58 00ffffff8d0899ae ffff9a2342f8a100 
ffff9a233e64a240
[ 1093.289050]  ffff9a233e653b40 ffff9a233e654000 7fffffffffffffff 
ffff9a233e64a240
[ 1093.312035]  ffffffff8d6d1244 ffff9a220b50ac28 ffff9a233e653b70 
ffffffff8d6cf1a6
[ 1093.335029] Call Trace:
[ 1093.343008]  [<ffffffff8d6d1244>] ? usleep_range+0x65/0x65
[ 1093.360096]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1093.375601]  [<ffffffff8d6d1287>] schedule_timeout+0x43/0x126
[ 1093.393490]  [<ffffffff8d3b7e78>] ? debug_smp_processor_id+0x17/0x19
[ 1093.413203]  [<ffffffff8d08a9d3>] ? wake_up_process+0x15/0x17
[ 1093.431075]  [<ffffffff8d6cfb45>] do_wait_for_common+0x123/0x15f
[ 1093.449709]  [<ffffffff8d6cfb45>] ? do_wait_for_common+0x123/0x15f
[ 1093.468870]  [<ffffffff8d08aa1c>] ? wake_up_q+0x47/0x47
[ 1093.485154]  [<ffffffff8d6cfc02>] wait_for_common+0x3b/0x55
[ 1093.502461]  [<ffffffff8d6cfc39>] wait_for_completion+0x1d/0x1f
[ 1093.520811]  [<ffffffff8d28d247>] btrfs_async_run_delayed_refs+0xd3/0xed
[ 1093.541485]  [<ffffffff8d2a5dbe>] __btrfs_end_transaction+0x2a7/0x2dd
[ 1093.561398]  [<ffffffff8d2a5e04>] btrfs_end_transaction+0x10/0x12
[ 1093.580241]  [<ffffffff8d2ad5fa>] btrfs_finish_ordered_io+0x3f7/0x4db
[ 1093.600133]  [<ffffffff8d088000>] ? preempt_latency_start+0x34/0x5d
[ 1093.619502]  [<ffffffff8d2ad946>] finish_ordered_fn+0x15/0x17
[ 1093.637313]  [<ffffffff8d2cfd68>] btrfs_scrubparity_helper+0x10e/0x258
[ 1093.657467]  [<ffffffff8d07b8c1>] ? pwq_activate_delayed_work+0x4d/0x5b
[ 1093.677881]  [<ffffffff8d2cff40>] btrfs_endio_write_helper+0xe/0x10
[ 1093.697254]  [<ffffffff8d07d068>] process_one_work+0x186/0x29d
[ 1093.715324]  [<ffffffff8d07d66d>] worker_thread+0x1ea/0x2ba
[ 1093.732624]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1093.750711]  [<ffffffff8d081df3>] kthread+0xb4/0xbc
[ 1093.765925]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1093.782718]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24
[ 1093.800556] INFO: task kworker/u16:14:172 blocked for more than 120 seconds.
[ 1093.847428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1093.871538] kworker/u16:14  D ffff9a233e65bb58     0   172      2 0x00000000
[ 1093.893373] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 1093.912298]  ffff9a233e65bb58 00ffffff8d0899ae ffffffff8dc0d4c0 
ffff9a233e654280
[ 1093.935281]  ffff9a233e65bb40 ffff9a233e65c000 7fffffffffffffff 
ffff9a233e654280
[ 1093.958274]  ffffffff8d6d1244 ffff9a2318e2e928 ffff9a233e65bb70 
ffffffff8d6cf1a6
[ 1093.981288] Call Trace:
[ 1093.989347]  [<ffffffff8d6d1244>] ? usleep_range+0x65/0x65
[ 1094.006469]  [<ffffffff8d6cf1a6>] schedule+0x8b/0xa3
[ 1094.022076]  [<ffffffff8d6d1287>] schedule_timeout+0x43/0x126
[ 1094.039943]  [<ffffffff8d3b7e78>] ? debug_smp_processor_id+0x17/0x19
[ 1094.059628]  [<ffffffff8d08a9d3>] ? wake_up_process+0x15/0x17
[ 1094.077525]  [<ffffffff8d6cfb45>] do_wait_for_common+0x123/0x15f
[ 1094.096215]  [<ffffffff8d6cfb45>] ? do_wait_for_common+0x123/0x15f
[ 1094.115490]  [<ffffffff8d08aa1c>] ? wake_up_q+0x47/0x47
[ 1094.131787]  [<ffffffff8d6cfc02>] wait_for_common+0x3b/0x55
[ 1094.149116]  [<ffffffff8d6cfc39>] wait_for_completion+0x1d/0x1f
[ 1094.167457]  [<ffffffff8d28d247>] btrfs_async_run_delayed_refs+0xd3/0xed
[ 1094.188167]  [<ffffffff8d2a5dbe>] __btrfs_end_transaction+0x2a7/0x2dd
[ 1094.208087]  [<ffffffff8d2a5e04>] btrfs_end_transaction+0x10/0x12
[ 1094.226949]  [<ffffffff8d2ad5fa>] btrfs_finish_ordered_io+0x3f7/0x4db
[ 1094.246856]  [<ffffffff8d088000>] ? preempt_latency_start+0x34/0x5d
[ 1094.266223]  [<ffffffff8d2ad946>] finish_ordered_fn+0x15/0x17
[ 1094.284063]  [<ffffffff8d2cfd68>] btrfs_scrubparity_helper+0x10e/0x258
[ 1094.304354]  [<ffffffff8d07b8c1>] ? pwq_activate_delayed_work+0x4d/0x5b
[ 1094.324809]  [<ffffffff8d2cff40>] btrfs_endio_write_helper+0xe/0x10
[ 1094.344190]  [<ffffffff8d07d068>] process_one_work+0x186/0x29d
[ 1094.362283]  [<ffffffff8d07d66d>] worker_thread+0x1ea/0x2ba
[ 1094.379576]  [<ffffffff8d07d483>] ? rescuer_thread+0x2d1/0x2d1
[ 1094.397667]  [<ffffffff8d081df3>] kthread+0xb4/0xbc
[ 1094.412878]  [<ffffffff8d6d23df>] ret_from_fork+0x1f/0x40
[ 1094.429647]  [<ffffffff8d081d3f>] ? init_completion+0x24/0x24

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to