https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115802
            Bug ID: 115802
           Summary: Non-atomic load of static variable moved out of loop
                    despite atomic fences
           Product: gcc
           Version: 14.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: securesneakers at gmail dot com
  Target Milestone: ---

Created attachment 58595
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58595&action=edit
Minimal program to reproduce the issue

The attached program contains:
- A toy mutex (spinlock)
- A toy condition variable
- A non-atomic _static_ flag, protected by the mutex
- A reader thread that waits for the flag to be set, using the mutex and the
  condition variable
- A writer thread that sets the flag and notifies the condition variable

The program compiles and runs, but hangs:

    $ gcc -std=c11 -O2 minimal-executable.c
    $ ./a.out

This is because the waiting loop:

    mutex_lock(&mtx);
    while (!val)
        cond_wait(&cnd, &mtx);
    mutex_unlock(&mtx);

has been optimized into an infinite loop:

    $ objdump -d a.out
    ...
        10b0:  e8 7b 01 00 00    call   1230 <cond_wait.constprop.0>
        10b5:  eb f9             jmp    10b0 <main+0x50>

This transformation means that the non-atomic load of "val" has been hoisted
out of the loop and above the "memory_order_seq_cst" load in the "mutex_lock"
function, so the flag is read once and never re-checked after "cond_wait"
returns.

Making the flag non-static or letting its address escape "fixes" it.

I am using GCC 14.1.1, but this is reproducible with all versions since at
least 4.9.2. I have noticed that Clang has a similar issue, but only since
Clang 13.