* Tejun Heo <[email protected]> [2026-04-10 08:53:30]:

Hi Tejun,

[ copying Samir Mulani to this thread ]

> Hello,
>
> > Seems that we (mostly Paul) have our own trick to track whether a CPU
> > has ever been onlined in RCU, see rcu_cpu_beenfullyonline(). Paul also
> > used it in his fix [1]. And I think it won't be that hard to copy it
> > into workqueue and let queue_work_on() use it so that if the user queues
> > a work on a never-onlined CPU, it can detect it (with a warning?) and do
> > something?
>
> The easiest way to do this is just creating the initial workers for all
> possible pools. Please see below. However, the downside is that it's going
> to create all workers for all possible cpus. This isn't a problem for
> anybody else but these IBM mainframes often come up with a lot of possible
> but not-yet-or-ever-online CPUs for capacity management, so the cost may not
> be negligible on some configurations.
>
> IBM folks, is that okay?

Even on PowerPC LPARs, it's not uncommon to have possible CPUs != online CPUs
at boot. However, your approach will work. Samir has also tested it already
and reported the results here:
https://lkml.kernel.org/r/[email protected]

>
> Also, why do you need to queue work items on an offline CPU? Do they
> actually have to be per-cpu? Can you get away with using an unbound
> workqueue?
>
> Thanks.
>
> From: Tejun Heo <[email protected]>
> Subject: workqueue: Create workers for all possible CPUs on init
>
> Per-CPU worker pools are initialized for every possible CPU during early boot,
> but workqueue_init() only creates initial workers for online CPUs. On systems
> where possible CPUs outnumber online CPUs (e.g. s390 LPARs with 76 online and
> 400 possible CPUs), the pools for never-onlined CPUs have POOL_DISASSOCIATED
> set but no workers. Any work item queued on such a CPU hangs indefinitely.
>
> This was exposed by 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible") which made SRCU schedule callbacks on all possible CPUs
> during size transitions, triggering workqueue lockup warnings for all
> never-onlined CPUs.
>
> Create workers for all possible CPUs during init, not just online ones. For
> online CPUs, the behavior is unchanged - POOL_DISASSOCIATED is cleared and the
> worker is bound to the CPU. For not-yet-online CPUs, POOL_DISASSOCIATED
> remains set, so worker_attach_to_pool() marks the worker UNBOUND and it can
> execute on any CPU. When the CPU later comes online, rebind_workers() handles
> the transition to associated operation as usual.
>

With this patch, if a CPU has been onlined once, it should be OK to queue
work on that CPU even if it's offline now.

> Reported-by: Vasily Gorbik <[email protected]>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Cc: Paul E. McKenney <[email protected]>

Reviewed-by: Srikar Dronamraju <[email protected]>

> ---
>  kernel/workqueue.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -8068,9 +8068,10 @@ void __init workqueue_init(void)
>  	for_each_bh_worker_pool(pool, cpu)
>  		BUG_ON(!create_worker(pool));
>  
> -	for_each_online_cpu(cpu) {
> +	for_each_possible_cpu(cpu) {
>  		for_each_cpu_worker_pool(pool, cpu) {
> -			pool->flags &= ~POOL_DISASSOCIATED;
> +			if (cpu_online(cpu))
> +				pool->flags &= ~POOL_DISASSOCIATED;
>  			BUG_ON(!create_worker(pool));
>  		}
>  	}
> --
> tejun

-- 
Thanks and Regards
Srikar Dronamraju
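
P.S. For anyone following the thread, here is a rough user-space sketch of
what the changed loop does. It is only a model, not kernel code: struct
pool_model, NR_POSSIBLE_MODEL, cpu_online_model[], model_workqueue_init() and
count_pools_without_workers() are names made up for illustration, it keeps a
single pool per CPU for simplicity, and the flag value is arbitrary. Only the
POOL_DISASSOCIATED name and the shape of the loop are taken from the patch
quoted above.

```c
#include <stdbool.h>

#define NR_POSSIBLE_MODEL 8        /* hypothetical number of possible CPUs */
#define POOL_DISASSOCIATED 0x1u    /* name from the patch; value is arbitrary here */

/* Hypothetical stand-in for a per-CPU worker pool. */
struct pool_model {
	unsigned int flags;
	int nr_workers;
};

/* Assumed boot state: CPUs 0-2 online, CPUs 3-7 possible but offline. */
static const bool cpu_online_model[NR_POSSIBLE_MODEL] = { true, true, true };

struct pool_model pools[NR_POSSIBLE_MODEL];

/*
 * all_possible == false models the old for_each_online_cpu() loop,
 * all_possible == true models the patched for_each_possible_cpu() loop.
 */
void model_workqueue_init(bool all_possible)
{
	/* Early boot: every possible CPU gets a pool, disassociated and empty. */
	for (int cpu = 0; cpu < NR_POSSIBLE_MODEL; cpu++) {
		pools[cpu].flags = POOL_DISASSOCIATED;
		pools[cpu].nr_workers = 0;
	}

	for (int cpu = 0; cpu < NR_POSSIBLE_MODEL; cpu++) {
		/* Old loop: CPUs that are not online are skipped entirely. */
		if (!all_possible && !cpu_online_model[cpu])
			continue;

		/* Only online CPUs have the flag cleared, as in the patch. */
		if (cpu_online_model[cpu])
			pools[cpu].flags &= ~POOL_DISASSOCIATED;

		/* Stands in for create_worker(pool). */
		pools[cpu].nr_workers++;
	}
}

/* Pools in this state are the ones where queued work would never run. */
int count_pools_without_workers(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_POSSIBLE_MODEL; cpu++)
		if (pools[cpu].nr_workers == 0)
			n++;
	return n;
}
```

With the assumed layout of 3 online and 8 possible CPUs,
model_workqueue_init(false) leaves five pools without a worker, while
model_workqueue_init(true) leaves none and keeps POOL_DISASSOCIATED set on the
five offline ones.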

