On Tue, Oct 8, 2019 at 1:10 PM <i...@apache.org> wrote: > > Author: ivan > Date: Tue Oct 8 11:10:32 2019 > New Revision: 1868129 > > URL: http://svn.apache.org/viewvc?rev=1868129&view=rev > Log: > apr_atomic_read64(): Fix non-atomic read on 32-bit Windows. [] > --- apr/apr/trunk/atomic/win32/apr_atomic64.c (original) > +++ apr/apr/trunk/atomic/win32/apr_atomic64.c Tue Oct 8 11:10:32 2019 > @@ -51,7 +51,9 @@ APR_DECLARE(void) apr_atomic_set64(volat > > APR_DECLARE(apr_uint64_t) apr_atomic_read64(volatile apr_uint64_t *mem) > { > - return *mem; > + /* 64-bit read is not atomic on 32-bit platform: use > InterlockedCompareExchange > + to perform atomic read. */ > + return InterlockedCompareExchange64((volatile LONG64 *)mem, 0, 0); > }
Shouldn't we make this #ifdef'd on WIN32/WIN64? Despite the win32 dirname, I think both Windows(es) share the same atomic/ code (same applies to apr_atomic_set64() I think, where *mem = val should be fine on WIN64). Then I don't know if we care enough about WIN32, but if so the InterlockedCompareExchange64() call here is maybe a bit of a hammer for usual apr_atomic_read64() usage. For things like lightly check for the current 64bit shared value and do the heavy CAS work for some special/unlikely condition only, this change is going to kill performances. Since we provide _cas64() already, I fell that read64/set64() should always be lightweight (when/if possible). I know that [2] says "Simple reads and writes to properly aligned 64-bit variables are atomic on 64-bit Windows. Reads and writes to 64-bit values are not guaranteed to be atomic on 32-bit Windows.", but it looks like in practice for CPUs that Windows cares about, an aligned 8-bytes (eg. our apr_uint64_t type) load/store can always be atomic. The load/store are likely compiled using x87 registers or xmm or sse or at worse lock cmpxchg8b (i.e. the above), and supposedly some 64bit or more registers available on ARM too. I'm no Windows expert, but wonder if we could use inline ReadAcquire64/WriteRelease64() functions from winnt.h/wdm.h. They seem to pair with Interlocked* builtins (same API), and have the semantics we want. I found them by naïvely (and unsuccessfully) searching for InterlockedRead/InterlockedWrite() functions, and falling into the discussion in [1]. It seems that there is no such functions in the windows API because .. there are not needed (well, IMHO, if atomic load64/store64 should be implemented using cas64, they look needed to me..). This seems to be confirmed by the "winnt.h" file I downloaded from the net, for the CPUs handled (and cared about) there, ReadAcquire64/WriteRelease64() are implemented using volatile access (i.e. *(volatile LONG64*)mem [= val]), plus a barrier for arm(s?), so relying on the native acquire/release semantics from the CPU/cache. So possibly r1868129 is not necessary, or if ReadAcquire64/WriteRelease64() can be used easily could we do that instead? (having a look at the WIN32 assembly code for apr_atomic_read64/set64() using ReadAcquire64/WriteRelease64() could be instructive, though) Regards, Yann. [1] https://community.osr.com/discussion/comment/254905/#Comment_254905 [2] https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access