https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

torvald at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |torvald at gcc dot gnu.org

--- Comment #8 from torvald at gcc dot gnu.org ---
(In reply to mikulas from comment #7)
> #include <stdatomic.h>
> 
> atomic_int a = ATOMIC_VAR_INIT(0);
> atomic_int b = ATOMIC_VAR_INIT(0);
> atomic_int p = ATOMIC_VAR_INIT(0);
> 
> int thread_1(void)
> {
>         atomic_store_explicit(&b, 1, memory_order_relaxed);
>         atomic_load_explicit(&p, memory_order_seq_cst);
>         return atomic_load_explicit(&a, memory_order_relaxed);
> }
> 
> int thread_2(void)
> {
>         atomic_store_explicit(&a, 1, memory_order_relaxed);
>         atomic_load_explicit(&p, memory_order_seq_cst);
>         return atomic_load_explicit(&b, memory_order_relaxed);
> }
> 
> See for example this. Suppose that thread_1 and thread_2 are executed
> concurrently. If memory_order_seq_cst were a proper full memory barrier, it
> would be impossible that both functions return 0.

memory_order_seq_cst is a memory order in the Standard's terminology.  Fences
are something else (i.e., atomic_thread_fence()), and they are parameterized by
a memory order.  A memory_order_seq_cst *memory access* does not have the same
effects as a memory_order_seq_cst fence.  See C++14 29.3p4-7; those paragraphs
talk about memory_order_seq_cst fences specifically, not about
memory_order_seq_cst operations in general.

If you want to make this example of Dekker synchronization correct, you need to
use fences instead of the accesses to p; alternatively, you need to use seq-cst
accesses for all the stores to and loads from a and b, in which case HW fences
will be added via the stores (as Andrew already pointed out).
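
As a rough sketch of the two fixes described above (not taken from the bug
report: the _fence and _sc suffixes and the separate variable pairs are made up
here just to keep the two variants apart, and plain "= 0" is used in place of
ATOMIC_VAR_INIT), the example could look like this in C11:

```c
#include <stdatomic.h>

/* Variant 1: relaxed accesses separated by a seq-cst fence.
   The variable and function names are hypothetical. */
static atomic_int a_fence = 0;
static atomic_int b_fence = 0;

int thread_1_fence(void)
{
        atomic_store_explicit(&b_fence, 1, memory_order_relaxed);
        /* A real fence replaces the seq-cst load of p. */
        atomic_thread_fence(memory_order_seq_cst);
        return atomic_load_explicit(&a_fence, memory_order_relaxed);
}

int thread_2_fence(void)
{
        atomic_store_explicit(&a_fence, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        return atomic_load_explicit(&b_fence, memory_order_relaxed);
}

/* Variant 2: no fence, but every store and load is seq-cst. */
static atomic_int a_sc = 0;
static atomic_int b_sc = 0;

int thread_1_sc(void)
{
        atomic_store_explicit(&b_sc, 1, memory_order_seq_cst);
        return atomic_load_explicit(&a_sc, memory_order_seq_cst);
}

int thread_2_sc(void)
{
        atomic_store_explicit(&a_sc, 1, memory_order_seq_cst);
        return atomic_load_explicit(&b_sc, memory_order_seq_cst);
}
```

In either variant, if the two functions of a pair run concurrently, the
Standard's rules should rule out the outcome where both return 0.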
