On Fri, Jun 29, 2018 at 01:55:12PM -0700, Isaac J. Manjarres wrote:
> When cpu_stop_queue_two_works() begins to wake the stopper
> threads, it does so without preemption disabled, which leads
> to the following race condition:
> 
> The source CPU calls cpu_stop_queue_two_works(), with cpu1
> as the source CPU, and cpu2 as the destination CPU. When
> adding the stopper threads to the wake queue used in this
> function, the source CPU stopper thread is added first,
> and the destination CPU stopper thread is added last.
> 
> When wake_up_q() is invoked to wake the stopper threads, the
> threads are woken up in the order that they are queued in,
> so the source CPU's stopper thread is woken up first, and
> it preempts the thread running on the source CPU.
> 
> The stopper thread will then execute on the source CPU,
> disable preemption, and begin executing multi_cpu_stop(),
> and wait for an ack from the destination CPU's stopper thread,
> with preemption still disabled. Since the worker thread that
> woke up the stopper thread on the source CPU is affine to the
> source CPU, and preemption is disabled on the source CPU, that
> thread will never run to dequeue the destination CPU's stopper
> thread from the wake queue, and thus, the destination CPU's
> stopper thread will never run, causing the source CPU's stopper
> thread to wait forever, and stall.
> 
> Disable preemption when waking the stopper threads in
> cpu_stop_queue_two_works() to ensure that the worker thread
> that is waking up the stopper threads isn't preempted
> by the source CPU's stopper thread, and permanently
> scheduled out, leaving the remaining stopper thread asleep
> in the wake queue.
> 
> Co-developed-by: Pavankumar Kondeti <pkond...@codeaurora.org>
> Signed-off-by: Prasad Sodagudi <psoda...@codeaurora.org>
> Signed-off-by: Pavankumar Kondeti <pkond...@codeaurora.org>
> Signed-off-by: Isaac J. Manjarres <isa...@codeaurora.org>

That SoB chain is broken, if Prasad wrote the ptch then there needs to
be a From: line somewhere.

But yes, that looks about right.

> ---
>  kernel/stop_machine.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index f89014a..1ff523d 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
> cpu_stop_work *work1,
>               goto retry;
>       }
>  
> -     wake_up_q(&wakeq);
> +     if (!err) {
> +             preempt_disable();
> +             wake_up_q(&wakeq);
> +             preempt_enable();
> +     }
>  
>       return err;
>  }
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 

Reply via email to