On Fri, Apr 10, 2026 at 08:53:30AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Thu, Apr 09, 2026 at 11:10:04AM -0700, Boqun Feng wrote:
> > On Thu, Apr 09, 2026 at 07:47:09AM -1000, Tejun Heo wrote:
> > > On Thu, Apr 09, 2026 at 10:40:05AM -0700, Boqun Feng wrote:
> > > > On Thu, Apr 09, 2026 at 10:26:49AM -0700, Boqun Feng wrote:
> > > > > On Thu, Apr 09, 2026 at 03:08:45PM +0200, Vasily Gorbik wrote:
> > > > > > Commit 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> > > > > > non-preemptible") defers srcu_node tree allocation when called under
> > > > > > raw spinlock, putting SRCU through ~6 transitional grace periods
> > > > > > > (SRCU_SIZE_ALLOC to SRCU_SIZE_BIG). During this transition srcu_gp_end()
> > > > > > > uses mask = ~0, which makes srcu_schedule_cbs_snp() call queue_work_on()
> > > > > > for every possible CPU. Since rcu_gp_wq is WQ_PERCPU, work targets
> > > > > > per-CPU pools directly - pools for not-online CPUs have no workers,
> > > > > 
> > > > > [Cc workqueue]
> > > > > 
> > > > > Hmm.. I thought for offline CPUs the corresponding worker pools become a
> > > > > unbound one hence there are still workers?
> > > > > 
> > > > 
> > > > Ah, as Paul replied in another email, the problem was because these CPUs
> > > > had never been onlined, so they don't even have unbound workers?
> > > 
> > > Hahaha, we do initialize worker pool for every possible CPU but the
> > > transition to unbound operation happens in the hot unplug callback. We
> > 
> > ;-) ;-) ;-)
> > 
> > > probably need to do some of the hot unplug operation during init if the CPU
> > 
> > Seems that we (mostly Paul) have our own trick to track whether a CPU
> > has ever been onlined in RCU, see rcu_cpu_beenfullyonline(). Paul also
> > used it in his fix [1]. And I think it won't be that hard to copy it
> > into workqueue and let queue_work_on() use it so that if the user queues
> > a work on a never-onlined CPU, it can detect it (with a warning?) and do
> > something?
> 
> The easiest way to do this is just creating the initial workers for all
> possible pools. Please see below. However, the downside is that it's going
> to create all workers for all possible cpus. This isn't a problem for
> anybody else but these IBM mainframes often come up with a lot of possible
> but not-yet-or-ever-online CPUs for capacity management, so the cost may not
> be negligible on some configurations.
> 
> IBM folks, is that okay?

I have also seen x86 systems whose firmware claimed very large numbers
of CPUs.  :-(

> Also, why do you need to queue work items on an offline CPU? Do they
> actually have to be per-cpu? Can you get away with using an unbound
> workqueue?

It is good for them to run on the specified CPU in the common case for
cache-locality reasons, but if they were occasionally redirected to some
other CPU, that would be just fine.

I am also keeping the patch that avoids queueing work to CPUs that are not
yet fully online.  Further adjustments will be needed if someone invokes
call_srcu(), synchronize_srcu(), or synchronize_srcu_expedited() from a
CPU that is not yet fully online.  Past experience of course suggests that
this will happen, and that there will be a good reason for it.  ;-)
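
For concreteness, the "prefer the specified CPU, but fall back when it
has never been fully online" policy can be modeled in userspace roughly
as follows.  This is only a sketch and is not the actual SRCU patch: the
model_been_fully_online[] array is a hypothetical stand-in for a check
like rcu_cpu_beenfullyonline(), and MODEL_NR_CPUS, MODEL_TARGET_UNBOUND,
and both function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS        8     /* hypothetical number of possible CPUs */
#define MODEL_TARGET_UNBOUND (-1)  /* hypothetical marker: "run on any CPU" */

/* Hypothetical stand-in for per-CPU "has been fully online" state. */
static bool model_been_fully_online[MODEL_NR_CPUS];

/* Record that a CPU has completed its first trip online. */
static void model_mark_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_been_fully_online[cpu] = true;
}

/*
 * Choose where to queue work nominally destined for @cpu: keep the
 * requested CPU for cache locality if it has ever been fully online,
 * otherwise ask for "any CPU" instead of a pool that may lack workers.
 */
static int model_pick_target(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_been_fully_online[cpu] ? cpu : MODEL_TARGET_UNBOUND;
}
```

In this model, a caller would queue on the returned CPU in the common
case and hand the work to an unbound queue when it gets
MODEL_TARGET_UNBOUND.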

                                                        Thanx, Paul

> Thanks.
> 
> From: Tejun Heo <[email protected]>
> Subject: workqueue: Create workers for all possible CPUs on init
> 
> Per-CPU worker pools are initialized for every possible CPU during early boot,
> but workqueue_init() only creates initial workers for online CPUs. On systems
> where possible CPUs outnumber online CPUs (e.g. s390 LPARs with 76 online and
> 400 possible CPUs), the pools for never-onlined CPUs have POOL_DISASSOCIATED
> set but no workers. Any work item queued on such a CPU hangs indefinitely.
> 
> This was exposed by 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible") which made SRCU schedule callbacks on all possible CPUs
> during size transitions, triggering workqueue lockup warnings for all
> never-onlined CPUs.
> 
> Create workers for all possible CPUs during init, not just online ones. For
> online CPUs, the behavior is unchanged - POOL_DISASSOCIATED is cleared and the
> worker is bound to the CPU. For not-yet-online CPUs, POOL_DISASSOCIATED
> remains set, so worker_attach_to_pool() marks the worker UNBOUND and it can
> execute on any CPU. When the CPU later comes online, rebind_workers() handles
> the transition to associated operation as usual.
> 
> Reported-by: Vasily Gorbik <[email protected]>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> ---
>  kernel/workqueue.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -8068,9 +8068,10 @@ void __init workqueue_init(void)
>               for_each_bh_worker_pool(pool, cpu)
>                       BUG_ON(!create_worker(pool));
> 
> -     for_each_online_cpu(cpu) {
> +     for_each_possible_cpu(cpu) {
>               for_each_cpu_worker_pool(pool, cpu) {
> -                     pool->flags &= ~POOL_DISASSOCIATED;
> +                     if (cpu_online(cpu))
> +                             pool->flags &= ~POOL_DISASSOCIATED;
>                       BUG_ON(!create_worker(pool));
>               }
>       }
> -- 
> tejun

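The behavior that the quoted changelog describes can be modeled in
userspace along the following lines.  This is a sketch under stated
assumptions, not kernel code: struct model_pool, model_pools[],
MODEL_NR_POSSIBLE, MODEL_POOL_DISASSOCIATED, and model_workqueue_init()
are all hypothetical names, and the single nr_workers counter stands in
for create_worker().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_POSSIBLE        6     /* hypothetical possible-CPU count */
#define MODEL_POOL_DISASSOCIATED 0x1u  /* models the POOL_DISASSOCIATED flag */

/* Hypothetical, heavily simplified per-CPU worker pool. */
struct model_pool {
	unsigned int flags;
	int nr_workers;
	bool worker_unbound;  /* true if the initial worker may run on any CPU */
};

static struct model_pool model_pools[MODEL_NR_POSSIBLE];

/*
 * Model of the patched init loop: every possible CPU gets an initial
 * worker, but only online CPUs have the disassociated flag cleared.
 */
static void model_workqueue_init(const bool *online)
{
	for (int cpu = 0; cpu < MODEL_NR_POSSIBLE; cpu++) {
		struct model_pool *pool = &model_pools[cpu];

		/* Pools start out disassociated for every possible CPU. */
		pool->flags = MODEL_POOL_DISASSOCIATED;
		if (online[cpu])
			pool->flags &= ~MODEL_POOL_DISASSOCIATED;

		/* Stand-in for create_worker(): one initial worker per pool. */
		pool->nr_workers = 1;
		pool->worker_unbound =
			(pool->flags & MODEL_POOL_DISASSOCIATED) != 0;
	}
}
```

With two of six CPUs online, all six model pools end up with a worker,
and the four not-yet-online ones have that worker marked unbound.
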