----- On Sep 4, 2019, at 12:09 PM, Peter Zijlstra [email protected] wrote:
> On Wed, Sep 04, 2019 at 11:19:00AM -0400, Mathieu Desnoyers wrote: >> ----- On Sep 3, 2019, at 4:36 PM, Linus Torvalds >> [email protected] >> wrote: > >> > I wonder if the easiest model might be to just use a percpu variable >> > instead for the membarrier stuff? It's not like it has to be in >> > 'struct task_struct' at all, I think. We only care about the current >> > runqueues, and those are percpu anyway. >> >> One issue here is that membarrier iterates over all runqueues without >> grabbing any runqueue lock. If we copy that state from mm to rq on >> sched switch prepare, we would need to ensure we have the proper >> memory barriers between: >> >> prior user-space memory accesses / setting the runqueue membarrier state >> >> and >> >> setting the runqueue membarrier state / following user-space memory accesses >> >> Copying the membarrier state into the task struct leverages the fact that >> we have documented and guaranteed those barriers around the rq->curr update >> in the scheduler. > > Should be the same as the barriers we already rely on for rq->curr, no? > That is, if we put this before switch_mm() then we have > smp_mb__after_spinlock() and switch_mm() itself. Yes, I think we can piggy-back on the already documented barriers documented around rq->curr store. > Also, if we place mm->membarrier_state in the same cacheline as mm->pgd > (which switch_mm() is bound to load) then we should be fine, I think. Yes, if we make sure membarrier_prepare_task_switch only updates the rq->membarrier_state if prev->mm != next->mm, we should be able to avoid loading next->mm->membarrier_state when switch_mm() is not invoked. I'll prepare RFC patch implementing this approach. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com

