Back when Will did his qspinlock determinism patches, we were left with one
cmpxchg loop on x86 due to the use of atomic_fetch_or(). Will proposed a nifty
trick:

  http://lkml.kernel.org/r/20180409145409.ga9...@arm.com

But at the time we didn't pursue it. This series implements that trick and
argues for its correctness. In particular it places an smp_mb__after_atomic()
between the two operations, which forces the load to come after the store
(which is free on x86 anyway).
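
To make the shape of this concrete, here is a minimal sketch of such a
fetch_or() replacement. The helper name, the use of the byte-wide pending
field, and the exact atomic variants are illustrative assumptions, not
necessarily what the patches themselves do:

/*
 * Illustrative sketch only; assumes the byte-addressable 'pending' field
 * layout from qspinlock_types.h and an invented helper name.
 */
static __always_inline u32 fetch_set_pending_acquire(struct qspinlock *lock)
{
	u32 val;

	/* Set the pending byte; the RMW returns the old byte value. */
	val = (u32)xchg(&lock->pending, 1) << _Q_PENDING_OFFSET;

	/* Order the load below after the xchg() above; no-op on x86. */
	smp_mb__after_atomic();

	/* Load the rest of the lock word (locked byte + tail). */
	val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;

	return val;
}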

This ordering ensures a concurrent unlock cannot trigger the uncontended
handoff. It also ensures that if the xchg() happens after a (successful)
trylock, we must observe that LOCKED bit; the interleaving below illustrates
that case.
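
Roughly (field and value names as in the sketch above, assumed rather than
taken from the patches):

	CPU0 (set pending)                    CPU1 (trylock)
	------------------                    --------------
	xchg(&lock->pending, 1);              cmpxchg(&lock->val, 0, _Q_LOCKED_VAL);
	smp_mb__after_atomic();
	val = atomic_read(&lock->val);

If CPU1's trylock succeeds before CPU0's xchg(), the barrier keeps the
atomic_read() from being hoisted above the xchg(), so val must show the
LOCKED bit and CPU0 cannot conclude the lock is uncontended.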
