On Nov 20 2025, at 7:03 pm, Andres Freund <[email protected]> wrote:
> Hi,
>
> On 2025-11-20 15:45:22 -0500, Greg Burd wrote:
>> Dave and I have been working together to get ARM64 with MSVC functional.
>> The attached patches accomplish that. Dave is the author of the first
>> which addresses some build issues and fixes the spin_delay() semantics,
>> I did the second which fixes some atomics in this combination.
>
> Thanks for working on this!

You're welcome, thanks for reviewing it. :)

>> MSVC's _InterlockedCompareExchange() intrinsic on ARM64 performs the
>> atomic operation but does NOT emit the necessary Data Memory Barrier
>> (DMB) instructions [4][5].
>
> I couldn't reproduce this result when playing around on godbolt. By specifying
> /arch:armv9.4 msvc can be convinced to emit the code for the intrinsics inline
> (at least for most of them). And that makes it visible that
> _InterlockedCompareExchange() results in a "casal" instruction. Looking that
> up shows:
>
> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/CASA--CASAL--CAS--CASL--CASAL--CAS--CASL--A64-
> which includes these two statements:
> "CASA and CASAL load from memory with acquire semantics."
> "CASL and CASAL store to memory with release semantics."

I didn't even think to check for a compiler flag for the architecture,
nice call! If this emits the correct instructions it is a much better
approach. I'll give it a try, thanks for the nudge.

>> Issue 2: S_UNLOCK() uses only a compiler barrier
>>
>> _ReadWriteBarrier() is a compiler barrier, NOT a hardware memory
>> barrier [6]. It prevents the compiler from reordering operations, but
>> the CPU can still reorder memory operations. This is fundamentally
>> insufficient for ARM64's weaker memory model.
>
> Yea, that seems broken on a non-TSO architecture. Is the problem
> fixed if you change just this to include a proper barrier?

Using the flag from above the _ReadWriteBarrier() does (in godbolt) turn
into a casal which (AFAIK) is going to do the trick.

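To make the compiler-barrier vs. hardware-barrier distinction concrete, here
is a minimal sketch in portable C11 atomics. It is not taken from the attached
patches and does not use the MSVC intrinsics; the demo_* names are
hypothetical and are not PostgreSQL's s_lock.h API. The acquire CAS and the
release store are the orderings a spinlock needs on a weakly ordered CPU,
while the last function only constrains the compiler, which is the pattern
described under "Issue 2".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; not PostgreSQL's slock_t or S_UNLOCK(). */
typedef struct
{
    atomic_int locked;          /* 0 = free, 1 = held */
} demo_spinlock;

static void
demo_init(demo_spinlock *l)
{
    atomic_init(&l->locked, 0);
}

/*
 * Single attempt to take the lock: CAS 0 -> 1. On success the operation has
 * acquire ordering, so accesses in the critical section cannot be observed
 * before the lock is taken.
 */
static bool
demo_try_lock(demo_spinlock *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/*
 * Unlock with a release store: writes made while holding the lock become
 * visible to other threads before the lock appears free.
 */
static void
demo_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/*
 * For contrast: atomic_signal_fence() restricts compiler reordering only and
 * emits no hardware fence, so the relaxed store that follows is not ordered
 * against earlier writes by the CPU. On a non-TSO machine this is not a
 * sufficient unlock.
 */
static void
demo_unlock_compiler_barrier_only(demo_spinlock *l)
{
    atomic_signal_fence(memory_order_seq_cst);
    atomic_store_explicit(&l->locked, 0, memory_order_relaxed);
}
```

Single-threaded, both unlock variants behave the same; the difference only
shows up when another CPU observes the lock word and the protected data in a
different order.
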
I'll see if I can update meson.build and get this to work as intended.

> Greetings,
>
> Andres Freund

best.

-greg
