Hi Mathieu,

On Fri, Aug 14, 2020 at 12:43:56PM -0400, Mathieu Desnoyers wrote:
> exit_mm should issue memory barriers after user-space memory accesses,
> before clearing current->mm, to order user-space memory accesses
> performed prior to exit_mm before clearing tsk->mm, which has the
> effect of skipping the membarrier private expedited IPIs.
> 
> The membarrier system call can be issued concurrently with do_exit
> if we have thread groups created with CLONE_VM but not CLONE_THREAD.
> 
> Here is the scenario I have in mind:
> 
> Two thread groups are created, A and B. Thread group B is created by
> issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD.
> Let's assume we have a single thread within each thread group (Thread A
> and Thread B).
> 
> The AFAIU we can have:
> 
> Userspace variables:
> 
> int x = 0, y = 0;
> 
> CPU 0                   CPU 1
> Thread A                Thread B
> (in thread group A)     (in thread group B)
> 
> x = 1
> barrier()
> y = 1
> exit()
> exit_mm()
> current->mm = NULL;
>                         r1 = load y
>                         membarrier()
>                           skips CPU 0 (no IPI) because its current mm is NULL
>                         r2 = load x
>                         BUG_ON(r1 == 1 && r2 == 0)
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>
> Cc: Peter Zijlstra (Intel) <pet...@infradead.org>
> Cc: Will Deacon <w...@kernel.org>
> Cc: Paul E. McKenney <paul...@kernel.org>
> Cc: Nicholas Piggin <npig...@gmail.com>
> Cc: Andy Lutomirski <l...@amacapital.net>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Linus Torvalds <torva...@linux-foundation.org>
> Cc: Alan Stern <st...@rowland.harvard.edu>
> Cc: linux...@kvack.org
> ---
> Changes since v1:
> - Use smp_mb__after_spinlock rather than smp_mb.
> - Document race scenario in commit message.
> ---
>  kernel/exit.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 733e80f334e7..fe64e6e28dd5 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -475,6 +475,14 @@ static void exit_mm(void)
>       BUG_ON(mm != current->active_mm);
>       /* more a memory barrier than a real lock */
>       task_lock(current);
> +     /*
> +      * When a thread stops operating on an address space, the loop
> +      * in membarrier_{private,global}_expedited() may not observe

Is it accurate to say that the correctness of
membarrier_global_expedited() relies on the observation of ->mm? Because
IIUC membarrier_global_expedited() loop doesn't check ->mm.

Regards,
Boqun

> +      * that tsk->mm, and not issue an IPI. Membarrier requires a
> +      * memory barrier after accessing user-space memory, before
> +      * clearing tsk->mm.
> +      */
> +     smp_mb__after_spinlock();
>       current->mm = NULL;
>       mmap_read_unlock(mm);
>       enter_lazy_tlb(mm, current);
> -- 
> 2.11.0
> 

Reply via email to