On Wed, Apr 29, 2026 at 07:50:31PM +0200, Vasily Gorbik wrote:
> On Tue, Apr 14, 2026 at 12:24:12PM -0700, Paul E. McKenney wrote:
> > On Thu, Apr 09, 2026 at 09:03:26PM -0700, Paul E. McKenney wrote:
> > Please see below for the full patch, including refraining from queueing
> > workqueue handlers on not-yet-online CPUs and diverting SRCU callbacks
> > from not-yet-fully-online CPUs to the boot CPU's callback queue.
> ...
> > commit ce533a60b2ef29a9b516cc717e77c6b679bc09c0
> > Author: Paul E. McKenney <[email protected]>
> > Date: Thu Apr 9 11:16:02 2026 -0700
> >
> > srcu: Don't queue workqueue handlers to never-online CPUs
> >
> > While an srcu_struct structure is in the midst of switching from CPU-0
> > to all-CPUs state, it can attempt to invoke callbacks for CPUs that
> > have never been online. Worse yet, it can attempt to invoke callbacks
> > for CPUs that never will be online due to not being present in the
> > cpu_possible_mask. This can cause hangs on s390, which is not set up to
> > deal with workqueue handlers being scheduled on such CPUs. This commit
> > therefore causes Tree SRCU to refrain from queueing workqueue handlers
> > on CPUs that have not yet (and might never) come online.
> >
> > Because callbacks are not invoked on CPUs that have not been
> > online, it is an error to invoke call_srcu(), synchronize_srcu(), or
> > synchronize_srcu_expedited() on a CPU that is not yet fully online.
> > However, it turns out to be less code to redirect the callbacks
> > from too-early invocations of call_srcu() than to warn about such
> > invocations. This commit therefore also redirects callbacks queued on
> > not-yet-fully-online CPUs to the boot CPU.
> >
> > Reported-by: Vasily Gorbik <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > Tested-by: Vasily Gorbik <[email protected]>
> > Cc: Tejun Heo <[email protected]>
>
> I retested it on s390 and on x86 KVM with --smp 16,maxcpus=255, all
> looks good to me.
>
> FWIW, again:
>
> Tested-by: Vasily Gorbik <[email protected]>
>
> Would you mind adding Cc: stable so it gets picked up for v7.0?
> 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible") is what made it reproducible for us.
>
> Thank you!

And thank you for testing it, plus apologies for the hassle!

At my next rebase, I will add the following:

Fixes: 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when non-preemptible")
Tested-by: Vasily Gorbik <[email protected]>

That should pull it into the needed -stable releases.

Seem reasonable?

							Thanx, Paul