| Issue | 179390 |
| Summary | LOCK CMPXCHG16B only inlined for std::atomic<T> when alignof(T)==16 with libstdc++, even though atomic<T> aligns itself. |
| Labels | new issue |
| Assignees | |
| Reporter | pcordes |
```
#include <atomic>
#include <cstdint>

struct Foo {
    // alignas(16) ////////// Without this, clang + libstdc++ won't inline lock cmpxchg16b
    std::uint64_t tick = 0;
    std::uint64_t tock = 0;
};

std::atomic<Foo> global_var;

void foo() {
    Foo foo = {0, 42};
    global_var.compare_exchange_strong(foo, Foo{}, std::memory_order_relaxed);
}
```
https://godbolt.org/z/PsWqqvGqT
With the default libstdc++ on GNU/Linux, clang++ (trunk or 21.1) compiles this to a call to `__atomic_compare_exchange`, even with `-O3 -mcx16 -mavx`. It does still inline load/store as `vmovaps`, which faults unless the address is 16-byte aligned, and `alignof(atomic<Foo>) == 16`, so libstdc++ must be using `alignas` internally somehow.
With libc++, or uncommenting `alignas(16)`, the CAS inlines as `lock cmpxchg16b` as expected, with the same compilers and options.
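The alignment relationships described above can be checked at compile time. This is a minimal sketch, not part of the original reproducer: `AlignedFoo`, `foo_align`, `aligned_foo_align` and `atomic_foo_align` are hypothetical names introduced for illustration, and the `sizeof` check assumes a typical target where `std::uint64_t` is 8 bytes with no padding. It deliberately avoids calling any atomic member function, so it should not need libatomic to link.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Same layout as the reproducer's Foo: two 64-bit fields, natural alignment.
struct Foo {
    std::uint64_t tick = 0;
    std::uint64_t tock = 0;
};

// Hypothetical variant with the manual-alignment workaround applied to the struct.
struct alignas(16) AlignedFoo {
    std::uint64_t tick = 0;
    std::uint64_t tock = 0;
};

// Small helpers (illustrative names) so the values can be printed or asserted.
constexpr std::size_t foo_align() { return alignof(Foo); }
constexpr std::size_t aligned_foo_align() { return alignof(AlignedFoo); }
constexpr std::size_t atomic_foo_align() { return alignof(std::atomic<Foo>); }

// The payload is 16 bytes either way; only its declared alignment differs.
static_assert(sizeof(Foo) == 16, "Foo should be exactly 16 bytes");
static_assert(sizeof(AlignedFoo) == 16, "alignas must not add padding here");
static_assert(aligned_foo_align() == 16, "alignas(16) must be honoured");

// The atomic wrapper can never be less aligned than the type it wraps.
static_assert(atomic_foo_align() >= foo_align(),
              "atomic<Foo> must be at least as aligned as Foo");
```

According to the report, `atomic_foo_align()` is 16 with libstdc++ even though `foo_align()` is smaller. The sketch does not assert that value itself, since it depends on the standard library in use.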
I don't know what libstdc++ does differently which breaks things.
ICX 2025.3.1 on Godbolt has the same behaviour, only inlining `lock cmpxchg16b` with the base struct being 16-byte aligned.
I also tested clang 20.1.8 on my own desktop to make sure it's not just something weird with Godbolt's libstdc++ install. Same lack of inlining unless I manually align the 16-byte type.
Reproducible with Clang as old as 3.5; before that, the CAS doesn't inline even with `alignas(16)`. It also reproduces with any `-std=` from `c++11` to `c++23`.
Given that alignment guarantee, we know that `lock cmpxchg16b` definitely won't cause a system-wide bus lock, so inlining it for RMW ops is a good thing.
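The reasoning behind that remark can be sketched with a little address arithmetic: a 16-byte object that starts on a 16-byte boundary can never straddle a cache-line boundary, which is the situation associated with split locks. This is an illustrative sketch only. `straddles_line` and `any_aligned_slot_straddles` are hypothetical helpers, and the 64-byte line size is an assumption (typical for current x86-64 CPUs), not something stated in the report.

```cpp
#include <cstdint>

// Assumed cache-line size in bytes; typical for x86-64, not taken from the report.
constexpr std::uint64_t kAssumedLineSize = 64;

// True if the byte range [addr, addr + size) touches more than one cache line.
constexpr bool straddles_line(std::uint64_t addr, std::uint64_t size,
                              std::uint64_t line = kAssumedLineSize) {
    return (addr / line) != ((addr + size - 1) / line);
}

// Checks every 16-byte-aligned start offset within a few cache lines and
// reports whether any 16-byte object placed there would cross a line boundary.
inline bool any_aligned_slot_straddles() {
    for (std::uint64_t addr = 0; addr < 4 * kAssumedLineSize; addr += 16) {
        if (straddles_line(addr, 16)) {
            return true;
        }
    }
    return false;
}
```

Under the 64-byte assumption, `any_aligned_slot_straddles()` returns `false`, whereas a merely 8-byte-aligned start such as offset 56 does cross a line boundary (`straddles_line(56, 16)` is `true`).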
Discovered by Davislor, on https://stackoverflow.com/questions/79880987/why-is-an-atomic-struct-of-two-64-bit-variables-not-lock-free-on-x86/79881533#79881533