On Tue Feb 10, 2026 at 11:15 AM CET, Alice Ryhl wrote:
> One way you can see this is by looking at what we require of the
> workqueue. For all this to work, it's pretty important that we never
> schedule anything on the workqueue that's not signalling safe, since
> otherwise you could have a deadlock where the workqueue executes some
> random job calling kmalloc(GFP_KERNEL) and then blocks on our fence,
> meaning that the VM_BIND job never gets scheduled since the workqueue
> is never freed up. Deadlock.
Yes, I also pointed this out multiple times in the past in the context
of the C GPU scheduler discussions. It really depends on the workqueue
and how it is used.

In the C GPU scheduler the driver can pass its own workqueue to the
scheduler, which means that the driver has to ensure that at least one
of the wq->max_active slots remains free for the scheduler to make
progress on its run and free job work. In other words, at most
wq->max_active - 1 work items may execute code that violates the DMA
fence signalling rules. This is also why the JobQ needs its own
workqueue and why relying on the system WQ is unsound.

In case of an ordered workqueue (max_active == 1), it is of course
always a potential deadlock to schedule work that performs non-atomic
allocations, or takes a lock under which non-atomic allocations are
performed elsewhere.
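
To make the failure mode concrete, here is a minimal sketch; the
workqueue and the work function are made up, only the interaction
matters:

  #include <linux/slab.h>
  #include <linux/workqueue.h>

  /*
   * Hypothetical ordered workqueue (max_active == 1) that also runs
   * the scheduler's run and free job work.
   */
  static struct workqueue_struct *shared_wq;

  static void unsafe_work_fn(struct work_struct *work)
  {
          /*
           * GFP_KERNEL may enter direct reclaim, and reclaim may wait
           * for a DMA fence that only signals once the scheduler's
           * run job work has executed -- on this very workqueue.
           * With max_active == 1 that work can never run while we
           * block here, hence the deadlock Alice describes above.
           */
          void *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);

          kfree(buf);
  }
  static DECLARE_WORK(unsafe_work, unsafe_work_fn);

A single queue_work(shared_wq, &unsafe_work) is all it takes, and
nothing at the call site looks suspicious, which is what makes this so
easy to get wrong.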
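
The safe setup, again just a sketch (assuming the current struct
drm_sched_init_args based drm_sched_init(); "my_device",
"my_sched_ops" and all other names are hypothetical), is to give the
scheduler a workqueue that is dedicated to it, so that every
max_active slot is signalling safe by construction:

  #include <linux/workqueue.h>
  #include <drm/gpu_scheduler.h>

  struct my_device {
          struct drm_gpu_scheduler sched;
          struct workqueue_struct *sched_wq;
  };

  /* run_job() / free_job() etc. elided */
  static const struct drm_sched_backend_ops my_sched_ops;

  static int my_sched_setup(struct my_device *mdev)
  {
          struct drm_sched_init_args args = {
                  .ops          = &my_sched_ops,
                  .num_rqs      = DRM_SCHED_PRIORITY_COUNT,
                  .credit_limit = 64,
                  .timeout      = msecs_to_jiffies(500),
                  .name         = "my-sched",
          };

          /*
           * Dedicated to the scheduler; the driver never queues its
           * own work here, hence nothing can occupy the single slot
           * with code that isn't signalling safe.
           */
          mdev->sched_wq = alloc_ordered_workqueue("my-sched-wq", 0);
          if (!mdev->sched_wq)
                  return -ENOMEM;

          args.submit_wq = mdev->sched_wq;

          return drm_sched_init(&mdev->sched, &args);
  }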
