On Wed, Apr 24, 2019 at 02:36:58PM +0200, Peter Zijlstra wrote:
> The comment describing the loongson_llsc_mb() reorder case doesn't
> make any sense what so ever. Instruction re-ordering is not an SMP
> artifact, but rather a CPU local phenomenon. This means that _every_
> LL/SC loop needs this barrier right in front to avoid the CPU from
> leaking a memop inside it.
>
> For the branch speculation case; if futex_atomic_cmpxchg_inatomic()
> needs one at the bne branch target, then surely the normal
> __cmpxch_asmg() implementation does too. We cannot rely on the
> barriers from cmpxchg() because cmpxchg_local() is implemented with
> the same macro, and branch prediction and speculation are, too, CPU
> local.
Also; just doing them all makes for much simpler rules and less mistakes.