Summary...

For Ubuntu Bionic, dpkg triggers for systemd (237-3ubuntu10.39) might
have caused systemd to hang:

[  363.776878]  wait_for_completion+0xba/0x140
[  363.776890]  __flush_work+0x15b/0x210
[  363.776901]  flush_delayed_work+0x41/0x50
[  363.776908]  fsnotify_wait_marks_destroyed+0x15/0x20
[  363.776912]  fsnotify_destroy_group+0x48/0xd0
[  363.776917]  inotify_release+0x1e/0x50
[  363.776923]  __fput+0xea/0x220
[  363.776929]  ____fput+0xe/0x10
[  363.776935]  task_work_run+0x9d/0xc0
[  363.776942]  exit_to_usermode_loop+0xc0/0xd0
[  363.776947]  do_syscall_64+0x121/0x130
[  363.776954]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

and

[  364.050206]  wait_for_completion+0xba/0x140
[  364.050238]  __synchronize_srcu.part.13+0x85/0xb0
[  364.050248]  synchronize_srcu+0x66/0xe0
[  364.050256]  fsnotify_mark_destroy_workfn+0x7b/0xe0
[  364.050262]  process_one_work+0x1de/0x420
[  364.050267]  worker_thread+0x228/0x410
[  364.050272]  kthread+0x121/0x140

and

[  364.326985]  wait_for_completion+0xba/0x140
[  364.326988]  __synchronize_srcu.part.13+0x85/0xb0
[  364.326993]  synchronize_srcu+0x66/0xe0
[  364.326995]  ? synchronize_srcu+0x66/0xe0
[  364.326996]  fsnotify_connector_destroy_workfn+0x4a/0x80
[  364.326998]  process_one_work+0x1de/0x420
[  364.326999]  worker_thread+0x253/0x410
[  364.327001]  kthread+0x121/0x140

All stack traces seem to come from the "fsnotify" subsystem: the tasks
are waiting on delayed work (a completion) for fsnotify marks
destruction after inotify_release() was called. The completion had not
happened within the 2-minute hung-task timeout. Without a kernel dump
it is hard to tell whether the completion was merely delayed - due to
a kthread being overloaded doing scheduled work and/or the marks group
destruction - or whether it was deadlocked due to a kernel bug.
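
To make the comparison of the traces less error-prone, the frames can be
extracted and filtered mechanically. The following is only a minimal
sketch, assuming the traces have been saved as text in the format shown
in this comment; the names frame_functions and fsnotify_frames are
hypothetical helpers, not part of any existing tool.

```python
import re

# One frame per line, e.g. "[  363.776878]  wait_for_completion+0xba/0x140".
# The optional "? " prefix appears on some frames (see the third trace).
FRAME_RE = re.compile(
    r"\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_functions(trace_text):
    """Return the function names in a stack trace, in printed order."""
    return [m.group(1) for m in FRAME_RE.finditer(trace_text)]


def fsnotify_frames(trace_text):
    """Return only the frames whose name mentions fsnotify or inotify."""
    return [
        name
        for name in frame_functions(trace_text)
        if "fsnotify" in name or "inotify" in name
    ]


# Second trace from this comment, copied verbatim.
SAMPLE_TRACE = """\
[  364.050206]  wait_for_completion+0xba/0x140
[  364.050238]  __synchronize_srcu.part.13+0x85/0xb0
[  364.050248]  synchronize_srcu+0x66/0xe0
[  364.050256]  fsnotify_mark_destroy_workfn+0x7b/0xe0
[  364.050262]  process_one_work+0x1de/0x420
[  364.050267]  worker_thread+0x228/0x410
[  364.050272]  kthread+0x121/0x140
"""

if __name__ == "__main__":
    print(frame_functions(SAMPLE_TRACE)[0])
    print(fsnotify_frames(SAMPLE_TRACE))
```

Running the same helpers over each trace shows whether they all block in
wait_for_completion and which fsnotify function leads there.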

If this is reproducible, I think a kernel dump would help identify the
issue. I'm letting the kernel team handle this and marking all other
issues as dealt with, per previous comments.
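
When checking whether the hang reproduces, it may help to tally the
hung-task reports in a captured console log, to see which tasks stay
blocked across successive reports. A minimal sketch, assuming the
console output has been saved as text and that the messages use the
"INFO: task NAME:PID blocked for more than N seconds" wording seen in
this bug; hung_task_reports is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Example: "[  242.906319] INFO: task kworker/u49:0:6 blocked for more than
# 120 seconds." The task name may itself contain ":", so the PID is taken
# as the last ":<digits>" before " blocked".
HUNG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+INFO: task (.+?):(\d+) blocked for more than "
    r"(\d+) seconds"
)


def hung_task_reports(log_text):
    """Map (task name, pid) to the timestamps at which it was reported."""
    reports = defaultdict(list)
    for m in HUNG_RE.finditer(log_text):
        timestamp = float(m.group(1))
        key = (m.group(2), int(m.group(3)))
        reports[key].append(timestamp)
    return dict(reports)


# A few lines taken from the console log in the bug description.
SAMPLE_LOG = """\
[  242.642684] INFO: task systemd:1 blocked for more than 120 seconds.
[  242.906319] INFO: task kworker/u49:0:6 blocked for more than 120 seconds.
[  363.465849] INFO: task systemd:1 blocked for more than 120 seconds.
[  484.850615] INFO: task udevadm:2665 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for (name, pid), stamps in sorted(hung_task_reports(SAMPLE_LOG).items()):
        print(name, pid, stamps)
```

A task that shows up in every report, as systemd (PID 1) does here, has
been blocked continuously rather than recovering between reports.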


** No longer affects: ipmitool (Ubuntu)

** Changed in: maas
       Status: New => Invalid

** Summary changed:

- commissioning fails due to hung tasks setting up ipmitool
+ [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-completion

** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

** No longer affects: linux (Ubuntu)

** Project changed: linux => linux (Ubuntu)

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Bionic)
       Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1872021

Title:
  [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-completion

Status in MAAS:
  Invalid
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Triaged
Status in linux source package in Eoan:
  Incomplete
Status in linux source package in Focal:
  Incomplete

Bug description:
  This is with MAAS 2.7.0, commissioning an HP DL385 G7 with Bionic.
  During the first boot, the machine performs an apt-get upgrade, which
  includes an update of ipmitool (1.8.18-5ubuntu0.1) and freeipmi-tools
  (1.4.11-1.1ubuntu4.1).
  This triggers hung tasks:
  [   66.457048] cloud-init[1534]: Setting up ipmitool (1.8.18-5ubuntu0.1) ...
           Starting IPMI event daemon...
  [  OK  ] Started IPMI event daemon.
  [   67.240857] cloud-init[1534]: Setting up freeipmi-tools (1.4.11-1.1ubuntu4.1) ...
  [   67.254241] cloud-init[1534]: Processing triggers for systemd (237-3ubuntu10.39) ...
  [  242.642684] INFO: task systemd:1 blocked for more than 120 seconds.
  [  242.725654]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  242.799835] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  242.906319] INFO: task kworker/u49:0:6 blocked for more than 120 seconds.
  [  242.997214]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  243.072024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  247.381896] cloud-init[1534]: Failed to reload daemon: Connection timed out
  [  247.385109] cloud-init[1534]: Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
  [  249.828279] cloud-init[1534]: Processing triggers for ureadahead (0.100.0-21) ...
  [  249.874840] cloud-init[1534]: Processing triggers for install-info (6.5.0.dfsg.1-2) ...
  [  250.160889] cloud-init[1534]: Processing triggers for libc-bin (2.27-3ubuntu1) ...
  [  363.465849] INFO: task systemd:1 blocked for more than 120 seconds.
  [  363.550072]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  363.623823] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  363.728949] INFO: task kworker/u49:0:6 blocked for more than 120 seconds.
  [  363.820481]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  363.894774] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  484.291609] INFO: task systemd:1 blocked for more than 120 seconds.
  [  484.381026]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  484.458451] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  484.568130] INFO: task kworker/u49:0:6 blocked for more than 120 seconds.
  [  484.661642]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  484.740371] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  484.850615] INFO: task udevadm:2665 blocked for more than 120 seconds.
  [  484.942379]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  485.018318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  607.165768] INFO: task systemd:1 blocked for more than 120 seconds.
  [  607.250976]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  607.325651] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  607.432497] INFO: task kworker/u49:0:6 blocked for more than 120 seconds.
  [  607.525757]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  607.599770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  607.707273] INFO: task udevadm:2665 blocked for more than 120 seconds.
  [  607.795467]       Not tainted 4.15.0-96-generic #97-Ubuntu
  [  607.869751] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

  I can perform more tests on this system if needed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1872021/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
