https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

--- Comment #7 from mikulas at artax dot karlin.mff.cuni.cz ---
#include <stdatomic.h>

atomic_int a = ATOMIC_VAR_INIT(0);
atomic_int b = ATOMIC_VAR_INIT(0);
atomic_int p = ATOMIC_VAR_INIT(0);

int thread_1(void)
{
        atomic_store_explicit(&b, 1, memory_order_relaxed);
        atomic_load_explicit(&p, memory_order_seq_cst);
        return atomic_load_explicit(&a, memory_order_relaxed);
}

int thread_2(void)
{
        atomic_store_explicit(&a, 1, memory_order_relaxed);
        atomic_load_explicit(&p, memory_order_seq_cst);
        return atomic_load_explicit(&b, memory_order_relaxed);
}

Consider the example above. Suppose that thread_1 and thread_2 are executed
concurrently. If the memory_order_seq_cst load were a proper full memory
barrier, it would be impossible for both functions to return 0. Because you
omit the barrier on the read of variable p, it is possible for both functions
to return 0.

thread_1 is compiled into
        movl    $1, b(%rip)
        movl    p(%rip), %eax
        movl    a(%rip), %eax
        ret
thread_2 is compiled into
        movl    $1, a(%rip)
        movl    p(%rip), %eax
        movl    b(%rip), %eax
        ret
... and the processor is free to move the writes past the reads that follow
them (x86 allows a store to be reordered after a later load), resulting in
both functions returning zero.
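For comparison, here is a minimal sketch of a variant that, as far as I understand the fence rules, should rule out the "both return 0" outcome: the seq_cst load of p is replaced by an explicit atomic_thread_fence(memory_order_seq_cst) between the store and the load. The function names thread_1_fenced and thread_2_fenced are made up for this illustration and are not part of the original test case; a and b play the same roles as above.

```c
#include <stdatomic.h>

/* Static storage duration: zero-initialized, which is a valid state for atomics. */
static atomic_int a;
static atomic_int b;

/* Hypothetical variant of thread_1: an explicit full fence separates the
   relaxed store to b from the relaxed load of a. */
int thread_1_fenced(void)
{
        atomic_store_explicit(&b, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        return atomic_load_explicit(&a, memory_order_relaxed);
}

/* Hypothetical variant of thread_2: the mirror image, store to a then load b. */
int thread_2_fenced(void)
{
        atomic_store_explicit(&a, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        return atomic_load_explicit(&b, memory_order_relaxed);
}
```

With a seq_cst fence in both functions, one of the two fences comes first in the total order of seq_cst operations, so the load after the later fence should observe the store made before the earlier one. On x86-64, GCC typically emits mfence or a locked instruction for this fence, which is exactly the barrier missing from the assembly above.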

Does the standard allow this behavior? I don't really know. I don't understand
the standard. Please tell me: how do you decide, by interpreting the claims in
section 7.17.3 of the C11 standard, whether the above outcome is allowed or
not?
