On Mon, Apr 12, 2021 at 06:22:42PM +0100, Valentin Schneider wrote:
> On 12/04/21 14:03, Peter Zijlstra wrote:
> > On Thu, Mar 11, 2021 at 03:13:04PM +0000, Valentin Schneider wrote:
> >> Peter Zijlstra <[email protected]> writes:
> >> > @@ -7910,6 +7908,14 @@ int sched_cpu_deactivate(unsigned int cp
> >> >          }
> >> >          rq_unlock_irqrestore(rq, &rf);
> >> >
> >> > +        /*
> >> > +         * From this point forward, this CPU will refuse to run any task that
> >> > +         * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
> >> > +         * push those tasks away until this gets cleared, see
> >> > +         * sched_cpu_dying().
> >> > +         */
> >> > +        balance_push_set(cpu, true);
> >> > +
> >>
> >> AIUI with cpu_dying_mask being flipped before even entering
> >> sched_cpu_deactivate(), we don't need this to be before the
> >> synchronize_rcu() anymore; is there more than that to why you're punting it
> >> back this side of it?
> >
> > I think it does does need to be like this, we need to clearly separate
> > the active=true and balance_push_set(). If we were to somehow observe
> > both balance_push_set() and active==false, we'd be in trouble.
> >
>
> I'm afraid I don't follow; we're replacing a read of rq->balance_push with
> cpu_dying(), and those are still written on the same side of the
> synchronize_rcu(). What am I missing?

Yeah, I'm not sure anymore either; I tried to work out why I'd done that
but upon closer examination everything fell flat.

Let me try again today :-)

> Oooh, I can't read, only the boot CPU gets its callback uninstalled in
> sched_init()! So secondaries keep push_callback installed up until
> sched_cpu_activate(), but as you said it's not effective unless a rollback
> happens.
>
> Now, doesn't that mean we should *not* uninstall the callback in
> sched_cpu_dying()? AFAIK it's possible for the initial secondary CPU
> boot to go fine, but the next offline+online cycle fails while going up -
> that would need to rollback with push_callback installed.

Quite; I removed that shortly after sending this; when I tried to write
a comment and found it.

