On Wed, Apr 29, 2026 at 07:50:31PM +0200, Vasily Gorbik wrote:
> On Tue, Apr 14, 2026 at 12:24:12PM -0700, Paul E. McKenney wrote:
> > On Thu, Apr 09, 2026 at 09:03:26PM -0700, Paul E. McKenney wrote:
> > Please see below for the full patch, including refraining from queueing
> > workqueue handlers on not-yet-online CPUs and diverting SRCU callbacks
> > from not-yet-fully-online CPUs to the boot CPU's callback queue.
> ...
> > commit ce533a60b2ef29a9b516cc717e77c6b679bc09c0
> > Author: Paul E. McKenney <[email protected]>
> > Date:   Thu Apr 9 11:16:02 2026 -0700
> > 
> >     srcu: Don't queue workqueue handlers to never-online CPUs
> >     
> >     While an srcu_struct structure is in the midst of switching from CPU-0
> >     to all-CPUs state, it can attempt to invoke callbacks for CPUs that
> > >     have never been online.  Worse yet, it can attempt to invoke callbacks
> >     for CPUs that never will be online due to not being present in the
> >     cpu_possible_mask.  This can cause hangs on s390, which is not set up to
> >     deal with workqueue handlers being scheduled on such CPUs.  This commit
> >     therefore causes Tree SRCU to refrain from queueing workqueue handlers
> >     on CPUs that have not yet (and might never) come online.
> >     
> >     Because callbacks are not invoked on CPUs that have not been
> >     online, it is an error to invoke call_srcu(), synchronize_srcu(), or
> >     synchronize_srcu_expedited() on a CPU that is not yet fully online.
> >     However, it turns out to be less code to redirect the callbacks
> >     from too-early invocations of call_srcu() than to warn about such
> >     invocations.  This commit therefore also redirects callbacks queued on
> >     not-yet-fully-online CPUs to the boot CPU.
> >     
> >     Reported-by: Vasily Gorbik <[email protected]>
> >     Signed-off-by: Paul E. McKenney <[email protected]>
> >     Tested-by: Vasily Gorbik <[email protected]>
> >     Cc: Tejun Heo <[email protected]>
> 
> I retested it on s390 and on x86 KVM with --smp 16,maxcpus=255, all
> looks good to me.
> 
> FWIW, again:
> 
> Tested-by: Vasily Gorbik <[email protected]>
> 
> Would you mind adding Cc: stable so it gets picked up for v7.0?
> 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible") is what made it reproducible for us.
> 
> Thank you!

And thank you for testing it, plus apologies for the hassle!

At my next rebase, I will add the following:

Fixes: 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when non-preemptible")
Tested-by: Vasily Gorbik <[email protected]>

That should pull it into the needed -stable releases.

Seem reasonable?

                                                        Thanx, Paul
