----- On Aug 4, 2020, at 10:51 AM, Peter Zijlstra [email protected] wrote:
> On Tue, Jul 28, 2020 at 12:00:10PM -0400, Mathieu Desnoyers wrote:
>> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
>> to allow the effect of membarrier(2) to apply to kthreads accessing
>> user-space memory as well.
>>
>> Given that no prior kthread use this guarantee and that it only affects
>> kthreads, adding this guarantee does not affect user-space ABI.
>>
>> Refine the check in membarrier_global_expedited to exclude runqueues
>> running the idle thread rather than all kthreads from the IPI cpumask.
>>
>> This patch applies on top of this patch from Peter Zijlstra:
>> "mm: fix kthread_use_mm() vs TLB invalidate" currently in Andrew
>> Morton's tree.
>>
>> Signed-off-by: Mathieu Desnoyers <[email protected]>
>> Cc: Peter Zijlstra (Intel) <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Paul E. McKenney <[email protected]>
>> Cc: Nicholas Piggin <[email protected]>
>> Cc: Andy Lutomirski <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> ---
>>  kernel/kthread.c          | 19 +++++++++++++++++++
>>  kernel/sched/membarrier.c |  8 ++------
>>  2 files changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> index 48925b17920e..ef2435517f14 100644
>> --- a/kernel/kthread.c
>> +++ b/kernel/kthread.c
>> @@ -1258,8 +1258,19 @@ void kthread_use_mm(struct mm_struct *mm)
>>  	finish_arch_post_lock_switch();
>>  #endif
>>
>> +	/*
>> +	 * When a kthread starts operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after storing to tsk->mm, before accessing
>> +	 * user-space memory. A full memory barrier for membarrier
>> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
>> +	 * mmdrop().
>> +	 */
>>  	if (active_mm != mm)
>>  		mmdrop(active_mm);
>> +	else
>> +		smp_mb();
>>
>>  	to_kthread(tsk)->oldfs = get_fs();
>>  	set_fs(USER_DS);
>> @@ -1280,6 +1291,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
>>  	set_fs(to_kthread(tsk)->oldfs);
>>
>>  	task_lock(tsk);
>> +	/*
>> +	 * When a kthread stops operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after accessing user-space memory, before
>> +	 * clearing tsk->mm.
>> +	 */
>> +	smp_mb();
>>  	sync_mm_rss(mm);
>>  	local_irq_disable();
>
> Would it make sense to put the smp_mb() inside the IRQ disable region?

I initially placed it right after task_lock() so we could eventually have a
smp_mb__after_non_raw_spinlock() or something with a better name, which
would allow removing the extra barrier when it is already implied by the
spinlock. I don't see moving the barrier inside the irq-off region as having
any significant effect as far as membarrier is concerned. Is it something
you need for the TLB flush?
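Just to illustrate the idea (a sketch only: neither the helper name nor the
config symbol below exist in the kernel today, both are placeholders), such a
helper could compile to nothing on architectures where the spinlock taken by
task_lock() already implies a full barrier, and fall back to smp_mb()
elsewhere:

	/* Hypothetical helper, for illustration only. */
	#ifdef CONFIG_ARCH_SPINLOCK_IMPLIES_MB		/* assumed symbol */
	/* spin_lock() already provides full ordering on these architectures. */
	#define smp_mb__after_non_raw_spinlock()	do { } while (0)
	#else
	/* Fall back to an explicit full barrier everywhere else. */
	#define smp_mb__after_non_raw_spinlock()	smp_mb()
	#endif

kthread_unuse_mm() would then call smp_mb__after_non_raw_spinlock() right
after task_lock(tsk) instead of the bare smp_mb().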
>
>>  		tsk->mm = NULL;
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index 168479a7d61b..8a294483074d 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
>>  			      MEMBARRIER_STATE_GLOBAL_EXPEDITED))
>>  			continue;
>>
>> -		/*
>> -		 * Skip the CPU if it runs a kernel thread. The scheduler
>> -		 * leaves the prior task mm in place as an optimization when
>> -		 * scheduling a kthread.
>> -		 */
>> +		/* Skip the CPU if it runs the idle thread. */
>>  		p = rcu_dereference(cpu_rq(cpu)->curr);
>> -		if (p->flags & PF_KTHREAD)
>> +		if (is_idle_task(p))
>>  			continue;
>
> Do we want to add a:
>
> 	WARN_ON_ONCE(current->mm);
>
> in play_idle_precise()?
>
> Because, if I read this right, we rely on the idle thread not having an
> mm.

Yes, that's a good idea.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
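For reference, a rough sketch of where the suggested assertion could sit in
play_idle_precise(); the surrounding checks are paraphrased from
kernel/sched/idle.c and may not match the upstream source exactly:

	void play_idle_precise(u64 duration_ns, u64 latency_ns)
	{
		/* Existing sanity checks on the idle-injection kthread (paraphrased). */
		WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
		WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
		WARN_ON_ONCE(!duration_ns);

		/*
		 * Suggested addition: membarrier_global_expedited() now skips
		 * CPUs running the idle task, so idle injection must never run
		 * with a user mm attached.
		 */
		WARN_ON_ONCE(current->mm);

		/* ... remainder of the function unchanged ... */
	}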

