https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69146
kelvin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kelvin at gcc dot gnu.org

--- Comment #4 from kelvin at gcc dot gnu.org ---
The spec for __sync_bool_compare_and_swap can be found at

https://gcc.gnu.org/onlinedocs/gcc-4.4.5/gcc/Atomic-Builtins.html

This states that the first argument may be a pointer to "any integral scalar or pointer type that is 1, 2, 4 or 8 bytes in length", but acknowledges that "Intel documentation allows only for the use of the types int, long, long long as well as their unsigned counterparts."

In discussing implementation options, the "spec" says:

"Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix `_n' where n is the size of the data type."

On Power, we would need a different implementation depending on whether the address of the modified data value is properly aligned. To use, for example, the lharx and sthcx. instructions (the halfword load-and-reserve / store-conditional pair), we would need an assurance that the address is a multiple of two. In this case, the compiler can see that it is.
However, there are also situations in which the compiler can see that the address of the argument is not a multiple of 2, such as the following code, in which case we would presumably need to generate a call to a helper function rather than emit in-line code:

  class A {
    bool m_fn1();
    bool b;
    short m_role;
  } __attribute__((packed));
  char a;
  bool A::m_fn1() {
    return __sync_bool_compare_and_swap(&m_role, -1, a);
  }

And there are situations in which the compiler cannot know for sure whether the address associated with the first argument is properly aligned, such as:

  char a;
  bool __generic_atomizer(short *sp) {
    return __sync_bool_compare_and_swap(sp, -1, a);
  }

In general, the compiler will not "know" that a particular shared data value is always accessed in the same way by all threads. That is, some threads may access the value from contexts in which the compiler knows it to be aligned, and other threads may access the same value from contexts in which the compiler cannot determine that it is properly aligned. So there would need to be some sort of implementation compatibility between the in-lined implementation and the function-call implementation.

Maybe this is all obvious to implementers and the details need not be belabored. I "feel" the necessity to add this comment because the lack of any discussion about these issues in the "API specification" leaves me with concerns that application programmers may not know how to use this API correctly.

What should happen, for example, with the following source?

  class A {
    bool i_fn1();
    bool b;
    double d;
    long int i;
  } __attribute__((packed));
  long int larry;
  bool A::i_fn1() {
    return __sync_bool_compare_and_swap(&i, -1, larry);
  }

I would be inclined to prohibit this usage, but there is nothing in the API description that allows me to do so.
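
To make the alignment concern above concrete, here is a minimal sketch (not a proposed GCC change). It assumes a GCC-compatible compiler and a 1-byte bool. PackedA and PackedB are hypothetical struct mirrors of the two packed classes above, and is_aligned and checked_cas are hypothetical helper names that appear nowhere in GCC or in the documentation. The static_asserts show what the compiler can see at compile time; checked_cas shows one way the "prohibit this usage" idea could be approximated at run time in a debug build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirrors of the packed classes in the comment, written as
// structs so that offsetof can name the members directly.
struct PackedA {
    bool b;
    short m_role;
} __attribute__((packed));

struct PackedB {
    bool b;
    double d;
    long int i;
} __attribute__((packed));

// Packing removes the padding after 'b', so the atomic operands land on
// offsets that are not multiples of their natural alignment.
static_assert(offsetof(PackedA, m_role) % alignof(short) != 0,
              "m_role is misaligned inside the packed struct");
static_assert(offsetof(PackedB, i) % alignof(long int) != 0,
              "i is misaligned inside the packed struct");

// Hypothetical helper: true when p is a multiple of 'align' bytes.
inline bool is_aligned(const void *p, std::size_t align) {
    return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}

// Hypothetical wrapper: rejects a misaligned operand in debug builds
// before handing it to the builtin. With NDEBUG defined the check
// disappears and the behavior is whatever the builtin does.
inline bool checked_cas(short *p, short expected, short desired) {
    assert(is_aligned(p, alignof(short)));
    return __sync_bool_compare_and_swap(p, expected, desired);
}
```

A caller like __generic_atomizer could route through checked_cas to catch misaligned pointers during testing. This does not address the compatibility question between in-line and out-of-line implementations; it only makes the misuse visible.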