Another instance of the same problem (also on 4.14.67), this time with
the hung task timeout restored to its 120 second default:

[  980.954129] INFO: task systemd:1 blocked for more than 120 seconds.
[  980.954132]       Tainted: G           OE   4.14.67-solidfire1 #1
[  980.954133] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  980.954135] systemd         D    0     1      0 0x00000000
[  980.954137] Call Trace:
[  980.954150]  ? __schedule+0x27f/0x870
[  980.954154]  ? check_preempt_wakeup+0x102/0x230
[  980.954156]  schedule+0x28/0x80
[  980.954159]  schedule_timeout+0x1e7/0x340
[  980.954163]  ? radix_tree_node_ctor+0x20/0x20
[  980.954165]  ? delete_node+0x1b8/0x1f0
[  980.954167]  ? wait_for_completion+0xb0/0x120
[  980.954169]  wait_for_completion+0xb0/0x120
[  980.954172]  ? wake_up_q+0x70/0x70
[  980.954176]  flush_work+0x10d/0x1c0
[  980.954179]  ? worker_detach_from_pool+0xa0/0xa0
[  980.954183]  fsnotify_destroy_group+0x34/0xa0
[  980.954185]  inotify_release+0x1a/0x50
[  980.954189]  __fput+0xd8/0x220
[  980.954194]  task_work_run+0x8a/0xb0
[  980.954198]  exit_to_usermode_loop+0xb9/0xc0
[  980.954200]  do_syscall_64+0x10b/0x120
[  980.954203]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  980.954206] RIP: 0033:0x7f031a552900
[  980.954208] RSP: 002b:00007fff93fd18b0 EFLAGS: 00000293 ORIG_RAX:
0000000000000003
[  980.954210] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00007f031a552900
[  980.954211] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000010
[  980.954212] RBP: 00007f031beba088 R08: 0000000000000020 R09: 000055ac5a21e2ad
[  980.954213] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[  980.954214] R13: 000055ac5a488b20 R14: 000055ac5c0907f0 R15: 0000000000000000
[  980.954724] INFO: task kworker/u114:4:41909 blocked for more than
120 seconds.
[  980.954725]       Tainted: G           OE   4.14.67-solidfire1 #1
[  980.954726] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  980.954727] kworker/u114:4  D    0 41909      2 0x80000000
[  980.954733] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[  980.954734] Call Trace:
[  980.954739]  ? __schedule+0x27f/0x870
[  980.954741]  schedule+0x28/0x80
[  980.954743]  schedule_timeout+0x1e7/0x340
[  980.954746]  ? __switch_to_asm+0x40/0x70
[  980.954747]  ? update_curr+0xe1/0x1a0
[  980.954750]  ? wait_for_completion+0xb0/0x120
[  980.954751]  wait_for_completion+0xb0/0x120
[  980.954753]  ? wake_up_q+0x70/0x70
[  980.954758]  __synchronize_srcu.part.13+0x76/0x90
[  980.954761]  ? trace_raw_output_rcu_utilization+0x40/0x40
[  980.954762]  ? try_to_wake_up+0x44/0x460
[  980.954765]  ? fsnotify_mark_destroy_workfn+0x67/0xb0
[  980.954767]  fsnotify_mark_destroy_workfn+0x67/0xb0
[  980.954769]  process_one_work+0x1da/0x3d0
[  980.954772]  worker_thread+0x21f/0x3f0
[  980.954775]  ? process_one_work+0x3d0/0x3d0
[  980.954777]  kthread+0x119/0x130
[  980.954779]  ? kthread_create_on_node+0x40/0x40
[  980.954781]  ret_from_fork+0x35/0x40

crash> print reaper_work
$3 = {
  work = {
    data = {
      counter = -117148584927483
    },
    entry = {
      next = 0xffff95743772b770,
      prev = 0xffffa2aa800a3df8
    },
    func = 0xffffffffa8249df0
  },
  timer = {
    entry = {
      next = 0xdead000000000200,
      pprev = 0x0
    },
    expires = 4295020559,
    function = 0xffffffffa80778c0,
    data = 18446744072250514272,
    flags = 69206020
  },
  wq = 0xffff95443f011400,
  cpu = 128
}

Any input is welcome.

-- 
Sebastian Kuzminsky

Reply via email to