From: Sonam Sanju <[email protected]>

On Tue, Mar 31, 2026 at 01:51:00PM -0700, Paul E. McKenney wrote:
> On Tue, Mar 31, 2026 at 11:17:19AM -0700, Sean Christopherson wrote:
> > Please don't post subsequent versions In-Reply-To previous versions, it
> > tends to muck up tooling.

Noted, will send future versions as new top-level threads. Sorry about that.

> > Unless I'm misunderstanding the bug, "fixing" in this in KVM is papering
> > over an underlying flaw. Essentially, this would be establishing a rule
> > that synchronize_srcu_expedited() can *never* be called while holding a
> > mutex. That's not viable.
>
> First, it is OK to invoke synchronize_srcu_expedited() while holding
> a mutex. Second, the synchronize_srcu_expedited() function's use of
> workqueues is the same as that of synchronize_srcu(), so in an alternate
> universe where it was not OK to invoke synchronize_srcu_expedited() while
> holding a mutex, it would also not be OK to invoke synchronize_srcu()
> while holding that same mutex. Third, it is also OK to acquire that
> same mutex within a workqueue handler. Fourth, SRCU and RCU use their
> own workqueue, which no one else should be using (and that prohibition
> most definitely includes the irqfd workers).

Thank you for clarifying this.

> As a result, I do have to ask... When you say "multiple irqfd workers",
> exactly how many such workers are you running?

While running cold reboot/warm reboot cycling on our Android platforms with
the 6.18 kernel, the hung_task traces consistently show 8-15
kvm-irqfd-cleanup workers in D state. These are crosvm instances with
roughly 10-16 irqfd lines per VM (virtio-blk, virtio-net, virtio-input,
virtio-snd, etc., each with a resampler).

Vineeth Pillai (Google) reproduced a related scenario under a VM
create/destroy stress test where the workqueue reached active=1024
refcnt=2062, though that is a much more extreme case than what we see
during normal shutdown.

The first part of the deadlock is genuinely there. One worker holds
resampler_lock and blocks in synchronize_srcu_expedited() while the
remaining 8-15 workers block on __mutex_lock at irqfd_resampler_shutdown.

Thanks,
Sonam

