membarrier_state racy load

Mathieu Desnoyers Tue, 03 Sep 2019 13:53:58 -0700

----- On Sep 3, 2019, at 4:27 PM, Linus Torvalds [email protected] wrote:

> On Tue, Sep 3, 2019 at 1:11 PM Mathieu Desnoyers
> <[email protected]> wrote:
>>
>> +       cpus_read_lock();
>> +       for_each_online_cpu(cpu) {
> 
> This would likely be better off using mm_cpumask(mm) instead of all
> online CPU's.

I considered using mm_cpumask(mm) in the original implementation of
the membarrier expedited private command, and chose to stick with the
online CPU mask instead.

Here was my off-list justification to Peter Zijlstra and Paul E. McKenney:

  If we have an iteration on mm_cpumask in the membarrier code,
  then we additionally need to document that memory barriers are
  required before and/or after all updates to the mm_cpumask, otherwise
  I think we end up in the same situation as with the rq->curr update.
  [...]
  So we'd be sprinkling even more memory barrier comments all over.

Considering the number of comments that had to be added around the
scheduler rq->curr update for membarrier, I'm concerned that the
additional analysis, documentation, and design constraints required
to safely use mm_cpumask() from membarrier are not really worth it
compared to iterating on online CPUs with the CPU hotplug read lock held.

> 
> Plus doing the rcu_read_lock() inside the loop seems pointless. Even
> with a lot of cores, it's not going to loop _that_ many times for RCU
> latency to be an issue.

Good point! I'll keep that in mind for the next round if we don't choose
an entirely different way forward.
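
For illustration only, here is a rough userspace model of that loop
shape: the CPU hotplug read lock and a single RCU read-side section are
both taken outside the iteration over online CPUs. This is not kernel
code. The locking primitives and for_each_online_cpu() are replaced by
counting stubs, and NR_CPUS, cpu_online_map, visit_cpu() and
walk_online_cpus() are hypothetical names, since the loop body of the
patch is not shown in this message.

```c
/* Userspace model only: kernel primitives are replaced by counting stubs. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8	/* assumed size for the model */

/* Assumed online mask for the model: CPUs 0, 1, 3, 6 and 7 are online. */
static bool cpu_online_map[NR_CPUS] = {
	true, true, false, true, false, false, true, true
};

static int hotplug_lock_depth;	/* nesting of the stubbed hotplug read lock */
static int rcu_depth;		/* nesting of the stubbed RCU read-side section */
static int rcu_lock_calls;	/* how many times rcu_read_lock() was called */
static int cpus_visited;	/* how many CPUs the loop body saw */

/* Stubs standing in for the kernel functions of the same names. */
static void cpus_read_lock(void)   { hotplug_lock_depth++; }
static void cpus_read_unlock(void) { assert(hotplug_lock_depth > 0); hotplug_lock_depth--; }
static void rcu_read_lock(void)    { rcu_depth++; rcu_lock_calls++; }
static void rcu_read_unlock(void)  { assert(rcu_depth > 0); rcu_depth--; }

/* Stub iterator: walks the model's online mask, skipping offline CPUs. */
#define for_each_online_cpu(cpu) \
	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
		if (cpu_online_map[(cpu)])

/* Hypothetical per-CPU work; the real loop body is not shown in this thread. */
static void visit_cpu(int cpu)
{
	(void)cpu;
	/* Both locks must already be held by the caller. */
	assert(hotplug_lock_depth > 0 && rcu_depth > 0);
	cpus_visited++;
}

static void walk_online_cpus(void)
{
	int cpu;

	cpus_read_lock();	/* keep the set of online CPUs stable */
	rcu_read_lock();	/* one read-side section for the whole walk */
	for_each_online_cpu(cpu) {
		visit_cpu(cpu);
	}
	rcu_read_unlock();
	cpus_read_unlock();
}
```

In this model, rcu_read_lock() is called once per walk rather than once
per CPU, which is the shape suggested above. Whether that is acceptable
in the kernel depends on how long the loop can run.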

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state racy load
