http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766
--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-10 09:43:16 UTC ---
(In reply to comment #3)
> > It says above them "In most cases, these builtins are considered a full
> > barrier." and only __sync_lock_test_and_set and __sync_lock_release
> > specify different barrier semantics.
>
> The next sentence is:
>
> "That is, no memory operand will be moved across the operation, either
> forward or backward."
>
> Note that this refers to memory operands, not memory operations -- memory
> stores and memory loads referenced in documentation of the other sync
> builtins.
> In other words, one could interpret "full memory barrier" as:
>
> asm volatile ("" : : : "memory");
>
> that refers to a GCC scheduling barrier.
>
> The GCC documentation references Intel processors, which do not have a
> distinction between instructions for RELEASE, ACQ_REL and SEQ_CST semantics.
>
> The basic problem is that the GCC builtins and atomic instruction semantics
> were designed for Intel processors that do not provide the level of
> granularity implemented in POWER processors.  The POWER port implemented
> lighter weight ACQ_REL semantics.  Retrofitting the original builtins on the
> new C++11 memory model semantics and imposing SEQ_CST interpretation has
> changed the behavior and performance on POWER, but not on other targets.

But for more precise semantics we now have the __atomic_* builtins, right?
And the __sync_* ones are deprecated.  I don't see how we can preserve the
old behavior for the __sync_* ones without adding a new target hook.
The documentation would need to be adjusted, of course.  I'm not sure
that different atomic semantics are "useful" for these "portable"
synchronization primitives?

Thus, fixing the libstdc++ side seems worthwhile, but I'm not sure with
respect to the deprecated __sync builtins?
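
For reference, the three readings of "barrier" in this thread can be put side by side in C. This is only a rough sketch, not taken from the GCC sources or from any patch: the helper names (compiler_barrier, legacy_fetch_add, seq_cst_fetch_add, acq_rel_fetch_add, publish, consume, demo) are made up for illustration, and only the __sync_* / __atomic_* builtins and the asm statement are real GCC features. On x86 the three fetch-and-add variants typically compile to the same locked instruction; on a target such as POWER the memory-order argument may select different barrier instructions, which is the granularity being discussed.

```c
#include <assert.h>

static int data;   /* payload written before the flag is set */
static int ready;  /* flag published with release semantics  */

/* Compiler-only (scheduling) barrier: GCC will not move memory accesses
   across it, but no hardware barrier instruction is emitted. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Legacy builtin; documented as a full barrier "in most cases". */
static int legacy_fetch_add(int *p, int v)
{
    return __sync_fetch_and_add(p, v);
}

/* __atomic builtin with the strongest ordering spelled out. */
static int seq_cst_fetch_add(int *p, int v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* Same operation with the lighter acquire-release ordering. */
static int acq_rel_fetch_add(int *p, int v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_ACQ_REL);
}

/* Release store: the write to data is ordered before the flag. */
static void publish(int value)
{
    data = value;
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/* Acquire load: once the flag is seen, data is visible too. */
static int consume(void)
{
    while (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        ;  /* spin until published */
    return data;
}

/* Single-threaded sanity check (hypothetical helper): all three
   variants return the previous value and add to the counter. */
static int demo(void)
{
    int counter = 0;
    int a = legacy_fetch_add(&counter, 1);
    compiler_barrier();
    int b = seq_cst_fetch_add(&counter, 1);
    int c = acq_rel_fetch_add(&counter, 1);
    assert(a == 0 && b == 1 && c == 2);
    return counter;
}
```

The functional result is the same in every case; the variants differ only in how much reordering the compiler and the hardware are allowed around the operation, so a single-threaded run like demo() cannot tell them apart.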