Hey Paul, Lai, Josh, and the RCU list and Jan/FS list -
Reaching out about a tricky hung task issue that I'm running into. I've
got a virtualized Linux guest on top of a KVM based platform, running
a 5.10.y based kernel. The issue we're running into is a hung task that
*only* happens on shutdown/reboot of this particular VM once every
20-50 times.
The signature of the hung task is always similar to the output below,
where we appear to hang on the call to
synchronize_srcu(&fsnotify_mark_srcu)
in fsnotify_connector_destroy_workfn / fsnotify_mark_destroy_workfn,
where two kernel threads are both calling synchronize_srcu, then
scheduling out in wait_for_completion, and completely going out to
lunch for over 4 minutes. This then triggers the hung task timeout and
things blow up.
We are running audit=1 for this system and are using an el8 based
userspace.
I've flipped through the fs/notify code base for both 5.10 as well as
upstream mainline to see if something jumped off the page, and I
haven't yet spotted any particular suspect code from the caller side.
This hang appears to come up at the very end of the shutdown/reboot
process, seemingly after the system starts to unwind through initrd.
What I'm working on now is adding some instrumentation to the dracut
shutdown initrd scripts to see if I can how far we get down that path
before the system fails to make forward progress, which may give some
hints. TBD on that. I've also enabled lockdep with CONFIG_PROVE_RCU and
a plethora of DEBUG options [2], and didn't get anything interesting.
To be clear, we haven't seen lockdep spit out any complaints as of yet.
Reaching out to see if this sounds familar to anyone on the list, or if
there are any particular areas of the RCU code base that might be
suspect for this kind of issue. I'm happy to provide more information,
as frankly, I'm quite stumped at the moment.
Thanks all,
Jon
[1] panic trace
Normal shutdown process, then hangs on the following:
...
dracut Warning: Killing all remaining processes
...
INFO: task kworker/u20:7:1200701 blocked for more than 241 seconds.
Tainted: G O 5.10.205-2.el8.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u20:7 state:D stack: 0 pid:1200701 ppid: 2
flags:0x00004080
Workqueue: events_unbound fsnotify_connector_destroy_workfn
Call Trace:
__schedule+0x267/0x790
schedule+0x3c/0xb0
schedule_timeout+0x219/0x2b0
wait_for_completion+0x9e/0x100
__synchronize_srcu.part.24+0x83/0xb0
? __bpf_trace_rcu_utilization+0x10/0x10
? synchronize_srcu+0x5d/0xf0
fsnotify_connector_destroy_workfn+0x46/0x80
process_one_work+0x1fc/0x390
worker_thread+0x2d/0x3e0
? process_one_work+0x390/0x390
kthread+0x114/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x1f/0x30
INFO: task kworker/u20:8:1287360 blocked for more than 241 seconds.
Tainted: G O 5.10.205-2.el8.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u20:8 state:D stack: 0 pid:1287360 ppid: 2
flags:0x00004080
Workqueue: events_unbound fsnotify_mark_destroy_workfn
Call Trace:
__schedule+0x267/0x790
schedule+0x3c/0xb0
schedule_timeout+0x219/0x2b0
? add_timer+0x14a/0x200
wait_for_completion+0x9e/0x100
__synchronize_srcu.part.24+0x83/0xb0
? __bpf_trace_rcu_utilization+0x10/0x10
fsnotify_mark_destroy_workfn+0x77/0xe0
process_one_work+0x1fc/0x390
? process_one_work+0x390/0x390
worker_thread+0x2d/0x3e0
? process_one_work+0x390/0x390
kthread+0x114/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x1f/0x30
Kernel panic - not syncing: hung_task: blocked tasks
CPU: 1 PID: 64 Comm: khungtaskd Kdump: loaded Tainted: G O
5.10.205-2.el8.x86_64 #1
Hardware name: Red Hat KVM, BIOS 20230302.1.2662.el8 04/01/2014
Call Trace:
dump_stack+0x6d/0x8c
panic+0x114/0x2ea
watchdog.cold.8+0xb5/0xb5
? hungtask_pm_notify+0x50/0x50
kthread+0x114/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x1f/0x30
[2] additional debugging config knobs turned up.
CONFIG_PROVE_LOCKING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_RWSEMS=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=15
CONFIG_LOCKDEP_CHAINS_BITS=16
CONFIG_LOCKDEP_STACK_TRACE_BITS=19
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
CONFIG_DEBUG_SHIRQ=y
CONFIG_WQ_WATCHDOG=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_PLIST=y
CONFIG_DEBUG_SG=y
CONFIG_DEBUG_NOTIFIERS=y
CONFIG_BUG_ON_DATA_CORRUPTION=y