On Thu, Jun 13, 2019 at 03:43:21PM +0200, Peter Zijlstra wrote:
> Recent probing at the Linux Kernel Memory Model uncovered a
> 'surprise'. Strongly ordered architectures where the atomic RmW
> primitive implies full memory ordering and
> smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
> fail for:
>
>	*x = 1;
>	atomic_inc(u);
>	smp_mb__after_atomic();
>	r0 = *y;
>
> Because, while the atomic_inc() implies memory order, it
> (surprisingly) does not provide a compiler barrier. This then allows
> the compiler to re-order like so:
>
>	atomic_inc(u);
>	*x = 1;
>	smp_mb__after_atomic();
>	r0 = *y;
>
> Which the CPU is then allowed to re-order (under TSO rules) like:
>
>	atomic_inc(u);
>	r0 = *y;
>	*x = 1;
>
> And this very much was not intended. Therefore strengthen the atomic
> RmW ops to include a compiler barrier.
>
> NOTE: atomic_{or,and,xor} and the bitops already had the compiler
> barrier.
>
> Reported-by: Andrea Parri <andrea.pa...@amarulasolutions.com>
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> ---
>  Documentation/atomic_t.txt         |    3 +++
>  arch/x86/include/asm/atomic.h      |    8 ++++----
>  arch/x86/include/asm/atomic64_64.h |    8 ++++----
>  arch/x86/include/asm/barrier.h     |    4 ++--
>  4 files changed, 13 insertions(+), 10 deletions(-)
>
> --- a/Documentation/atomic_t.txt
> +++ b/Documentation/atomic_t.txt
> @@ -194,6 +194,9 @@ These helper barriers exist because arch
>  ordering on their SMP atomic primitives. For example our TSO architectures
>  provide full ordered atomics and these barriers are no-ops.
>
> +NOTE: when the atomic RmW ops are fully ordered, they should also imply a
> +compiler barrier.
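[Editorial note: the arch/x86 hunks of the patch are trimmed above. For readers
following along, here is a minimal user-space sketch of the mechanism involved;
the helper names are made up for illustration and this is not the kernel's
actual code. A LOCK-prefixed RmW is fully ordered on the CPU, but only an asm
"memory" clobber (the same thing barrier() expands to, an empty asm with that
clobber) stops the compiler from moving other memory accesses across it.]

	/*
	 * Illustrative sketch only, not the actual kernel hunks:
	 * an x86 LOCK-prefixed increment written as GNU C inline asm.
	 */
	static inline void inc_no_compiler_barrier(int *v)
	{
		/*
		 * Fully ordered on the CPU, but the compiler may still
		 * move accesses to *other* objects across this asm.
		 */
		asm volatile("lock incl %0" : "+m" (*v));
	}

	static inline void inc_with_compiler_barrier(int *v)
	{
		/*
		 * The "memory" clobber forbids the compiler from
		 * reordering any memory access across the asm, so a
		 * plain store like "*x = 1" in the example above can
		 * no longer slip past the increment.
		 */
		asm volatile("lock incl %0" : "+m" (*v) : : "memory");
	}

[With the clobber in place, the litmus sequence quoted above keeps its intended
order under TSO, which is the effect the patch is after.]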
Acked-by: Will Deacon <will.dea...@arm.com>

Cheers,

Will