On Mon, Jan 12, 2026 at 11:30:52PM +0530, Vishal Chourasia wrote:
> Hello Peter,
>
> On Mon, Jan 12, 2026 at 03:24:40PM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 12, 2026 at 03:13:33PM +0530, Vishal Chourasia wrote:
> > > Bulk CPU hotplug operations—such as switching SMT modes across all
> > > cores—require hotplugging multiple CPUs in rapid succession. On large
> > > systems, this process takes significant time, increasing as the number
> > > of CPUs grows, leading to substantial delays on high-core-count
> > > machines. Analysis [1] reveals that the majority of this time is spent
> > > waiting for synchronize_rcu().
> > >
> > > Expedite synchronize_rcu() during the hotplug path to accelerate the
> > > operation. Since CPU hotplug is a user-initiated administrative task,
> > > it should complete as quickly as possible.
> > >
> > > Performance data on a PPC64 system with 400 CPUs:
> > >
> > > + ppc64_cpu --smt=1 (SMT8 to SMT1)
> > > Before: real 1m14.792s
> > > After: real 0m03.205s # ~23x improvement
> > >
> > > + ppc64_cpu --smt=8 (SMT1 to SMT8)
> > > Before: real 2m27.695s
> > > After: real 0m02.510s # ~58x improvement
> > >
> >
> > But who cares? It's not like you'd *ever* do this, right?
> Users dynamically adjust SMT modes to optimize performance of the
> workload being run. Yes, it doesn't happen too often, but when it
> does, on machines with >= 1920 CPUs, it takes more than 20 minutes to
> finish.

Users cannot change this; it is root-only.

Having to change SMT mode per workload seems quite insane; but whatever.

If you do have to put RCU hooks anywhere, I'd much rather see them in
cpuhp_smt_{en,dis}able(), such that they only affect the batch hotplug
case, rather than everything using cpus_write_lock().
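
To make the placement concrete, here is a minimal userspace model, not
kernel code. Every model_* name is made up for illustration; they stand
in for rcu_expedite_gp()/rcu_unexpedite_gp() and for the per-CPU offline
step, and "thread 0 of each core is the primary" is a simplifying
assumption. The only point is that the expedite window brackets the
batch loop, so an unrelated single-CPU hotplug still sees normal grace
periods.

```c
#include <assert.h>

/* Userspace model only; every model_* name is hypothetical. */
int expedite_nesting;	/* > 0 means grace periods would be expedited */
int expedited_ops;	/* hotplug ops that ran inside the window */
int normal_ops;		/* hotplug ops that ran outside it */

void model_rcu_expedite_gp(void)
{
	expedite_nesting++;
}

void model_rcu_unexpedite_gp(void)
{
	assert(expedite_nesting > 0);
	expedite_nesting--;
}

/* One hotplug operation; it only records which kind of grace period it saw. */
void model_cpu_down(int cpu)
{
	(void)cpu;
	if (expedite_nesting > 0)
		expedited_ops++;
	else
		normal_ops++;
}

/* Batch path: offline every non-primary thread of each core. */
void model_smt_disable(int nr_cpus, int threads_per_core)
{
	int cpu;

	model_rcu_expedite_gp();
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu % threads_per_core == 0)
			continue;	/* assumed primary thread, stays online */
		model_cpu_down(cpu);
	}
	model_rcu_unexpedite_gp();
}
```

Calling model_smt_disable(16, 8) offlines 14 threads, all of them inside
the expedited window; a lone model_cpu_down() afterwards is counted as a
normal operation.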

Also note that there is a case to be made to optimize this batch hotplug
case; for one, it makes no sense to take cpus_write_lock() over and over
and over again. If you can pull that out of the loop, just like
cpu_maps_update_begin() has already been lifted, this would help.
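
Roughly what I mean by pulling the lock out, again as a userspace model
and not a patch. The model_* and offline_siblings_* names are
hypothetical, the lock is just a flag plus a counter, and "thread 0 of
each core is the primary" is a simplifying assumption. It only contrasts
taking the write lock once per CPU with taking it once per batch.

```c
#include <assert.h>

/* Userspace model only; all names here are hypothetical. */
int write_lock_acquisitions;	/* how often the write lock was taken */
int lock_held;

void model_cpus_write_lock(void)
{
	assert(!lock_held);
	lock_held = 1;
	write_lock_acquisitions++;
}

void model_cpus_write_unlock(void)
{
	assert(lock_held);
	lock_held = 0;
}

/* Offline step that expects the caller to already hold the lock. */
void model_cpu_down_locked(int cpu)
{
	(void)cpu;
	assert(lock_held);
}

/* Per-CPU shape: lock, offline one CPU, unlock, repeat. */
int offline_siblings_per_cpu(int nr_cpus, int threads_per_core)
{
	int cpu, done = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu % threads_per_core == 0)
			continue;	/* assumed primary thread, stays online */
		model_cpus_write_lock();
		model_cpu_down_locked(cpu);
		model_cpus_write_unlock();
		done++;
	}
	return done;
}

/* Batched shape: take the lock once around the whole loop. */
int offline_siblings_batched(int nr_cpus, int threads_per_core)
{
	int cpu, done = 0;

	model_cpus_write_lock();
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu % threads_per_core == 0)
			continue;	/* assumed primary thread, stays online */
		model_cpu_down_locked(cpu);
		done++;
	}
	model_cpus_write_unlock();
	return done;
}
```

With 16 CPUs and 8 threads per core, both variants offline 14 threads;
the per-CPU variant takes the lock 14 times, the batched variant once.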

And Joel has a point, in that it might make sense for RCU to behave
'better' under these conditions.