https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767
torvald at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |torvald at gcc dot gnu.org

--- Comment #8 from torvald at gcc dot gnu.org ---
(In reply to mikulas from comment #7)
> #include <stdatomic.h>
>
> atomic_int a = ATOMIC_VAR_INIT(0);
> atomic_int b = ATOMIC_VAR_INIT(0);
> atomic_int p = ATOMIC_VAR_INIT(0);
>
> int thread_1(void)
> {
>     atomic_store_explicit(&b, 1, memory_order_relaxed);
>     atomic_load_explicit(&p, memory_order_seq_cst);
>     return atomic_load_explicit(&a, memory_order_relaxed);
> }
>
> int thread_2(void)
> {
>     atomic_store_explicit(&a, 1, memory_order_relaxed);
>     atomic_load_explicit(&p, memory_order_seq_cst);
>     return atomic_load_explicit(&b, memory_order_relaxed);
> }
>
> See for example this. Suppose that thread_1 and thread_2 are executed
> concurrently. If memory_order_seq_cst were a proper full memory barrier, it
> would be impossible that both functions return 0.

memory_order_seq_cst is a memory order in the Standard's terminology. Fences
are something else (i.e., atomic_thread_fence()), and are parametrized by a
memory order. A memory_order_seq_cst *memory access* does not have the same
effects as a memory_order_seq_cst fence. See C++14 29.3p4-7; those paragraphs
talk about memory_order_seq_cst fences specifically, not about
memory_order_seq_cst operations in general.

If you want to make this example of Dekker synchronization correct, you need
to use fences instead of the accesses to p; alternatively, you need to use
seq-cst accesses for all the stores and loads to a and b, in which case there
will be HW fences added via the stores (as Andrew already pointed out).
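
For illustration, the two fixes described in this comment might look roughly like the sketch below. It is not code from the bug report: the function names thread_1_fence, thread_2_fence, thread_1_seq_cst, thread_2_seq_cst and the helper reset_flags are hypothetical names chosen here, and plain initializers are used instead of ATOMIC_VAR_INIT, which newer C standards no longer provide.

```c
#include <stdatomic.h>

/* Shared flags, as in the quoted example. The dummy variable p is dropped
 * because neither variant needs it. */
atomic_int a = 0;
atomic_int b = 0;

/* Hypothetical helper: put both flags back to 0 between experiments. */
void reset_flags(void)
{
    atomic_store_explicit(&a, 0, memory_order_relaxed);
    atomic_store_explicit(&b, 0, memory_order_relaxed);
}

/* Variant 1: relaxed accesses, with a seq_cst *fence* between the store
 * and the load in place of the seq_cst load of p. */
int thread_1_fence(void)
{
    atomic_store_explicit(&b, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&a, memory_order_relaxed);
}

int thread_2_fence(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&b, memory_order_relaxed);
}

/* Variant 2: no fences, but every store and load of a and b is itself a
 * seq_cst operation, so all four accesses take part in the single total
 * order of seq_cst operations. */
int thread_1_seq_cst(void)
{
    atomic_store_explicit(&b, 1, memory_order_seq_cst);
    return atomic_load_explicit(&a, memory_order_seq_cst);
}

int thread_2_seq_cst(void)
{
    atomic_store_explicit(&a, 1, memory_order_seq_cst);
    return atomic_load_explicit(&b, memory_order_seq_cst);
}
```

In either variant, if the two functions of a pair run concurrently (each on its own thread, starting from a == b == 0), at least one of them should return 1. Mixing the variants, or keeping the relaxed accesses and only adding a seq_cst load of an unrelated variable as in the quoted code, gives no such guarantee.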