On Fri, Jun 29, 2018 at 01:55:12PM -0700, Isaac J. Manjarres wrote: > When cpu_stop_queue_two_works() begins to wake the stopper > threads, it does so without preemption disabled, which leads > to the following race condition: > > The source CPU calls cpu_stop_queue_two_works(), with cpu1 > as the source CPU, and cpu2 as the destination CPU. When > adding the stopper threads to the wake queue used in this > function, the source CPU stopper thread is added first, > and the destination CPU stopper thread is added last. > > When wake_up_q() is invoked to wake the stopper threads, the > threads are woken up in the order that they are queued in, > so the source CPU's stopper thread is woken up first, and > it preempts the thread running on the source CPU. > > The stopper thread will then execute on the source CPU, > disable preemption, and begin executing multi_cpu_stop(), > and wait for an ack from the destination CPU's stopper thread, > with preemption still disabled. Since the worker thread that > woke up the stopper thread on the source CPU is affine to the > source CPU, and preemption is disabled on the source CPU, that > thread will never run to dequeue the destination CPU's stopper > thread from the wake queue, and thus, the destination CPU's > stopper thread will never run, causing the source CPU's stopper > thread to wait forever, and stall. > > Disable preemption when waking the stopper threads in > cpu_stop_queue_two_works() to ensure that the worker thread > that is waking up the stopper threads isn't preempted > by the source CPU's stopper thread, and permanently > scheduled out, leaving the remaining stopper thread asleep > in the wake queue. > > Co-developed-by: Pavankumar Kondeti <pkond...@codeaurora.org> > Signed-off-by: Prasad Sodagudi <psoda...@codeaurora.org> > Signed-off-by: Pavankumar Kondeti <pkond...@codeaurora.org> > Signed-off-by: Isaac J. Manjarres <isa...@codeaurora.org>
That SoB chain is broken, if Prasad wrote the ptch then there needs to be a From: line somewhere. But yes, that looks about right. > --- > kernel/stop_machine.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c > index f89014a..1ff523d 100644 > --- a/kernel/stop_machine.c > +++ b/kernel/stop_machine.c > @@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct > cpu_stop_work *work1, > goto retry; > } > > - wake_up_q(&wakeq); > + if (!err) { > + preempt_disable(); > + wake_up_q(&wakeq); > + preempt_enable(); > + } > > return err; > } > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >