I took a look at APR's configure.in, and its default for all architectures except i486, i586, and i686 is to use the real atomic ops; for those three architectures the default is to use the "generic" atomic ops. Any idea why there is a special rule for those three? There's nothing wrong with the atomic operations on those CPUs: otherwise, how would we have had semaphores and mutexes on them all these years? I guess that is a question for the APR dev mailing list.
I see that some distros override that default. E.g., the libapr1.spec for openSUSE has:

%ifarch %ix86
    --enable-nonportable-atomics=yes \
%endif

and in /usr/lib/rpm/macros:

On Tue, Dec 3, 2013 at 12:54 PM, Yann Ylavic <ylavic....@gmail.com> wrote:
> I personally like this solution better (IMHO) since it does not rely on
> apr_thread_mutex_trylock() to be wait-free/userspace (e.g. natively
> implements the "compare and swap").
>
> On the other hand, apr_atomic_cas32() may itself be implemented using
> apr_thread_mutex_lock() when USE_ATOMICS_GENERIC is defined (explicitly, or
> with --enable-nonportable-atomics=no, or else forcibly with "gcc -std=c89"
> or Intel CPUs <= i686).
>
> Hence with USE_ATOMICS_GENERIC, apr_thread_mutex_trylock() may be a better
> solution than apr_thread_mutex_lock()...
>
> On Tue, Dec 3, 2013 at 6:01 PM, Daniel Lescohier <daniel.lescoh...@cbsi.com> wrote:
>> If the developers list is OK using apr_atomic in the server core, there
>> would be lots of advantages over trylock:
>>
>> 1. No need for child init.
>> 2. No need for function pointers.
>> 3. Could have a lock per cache element (I deemed it too expensive
>>    memory-wise to have a large mutex structure per cache element).
>> 4. It would avoid the problem of trylock not being implemented on all
>>    platforms.
>> 5. Fewer parameters to the function macro.
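For readers following the locking scheme above: apr_atomic_cas32(mem, with, cmp) atomically stores `with` into *mem if *mem currently equals `cmp`, and returns the value *mem held beforehand, so a return of 0 means the lock was taken successfully. A minimal standalone sketch of that try-lock pattern, using the GCC __sync builtin as a stand-in for the APR call (an assumption for the sake of a compilable example; the real apr_atomic_cas32() has the same semantics but a different argument order):

```c
#include <stdint.h>

/* Stand-in for apr_atomic_cas32(mem, with, cmp): atomically set *mem
 * to `with` if *mem == cmp, returning the value *mem held beforehand.
 * Note GCC's builtin takes (ptr, oldval, newval), i.e. cmp before with. */
static uint32_t cas32(volatile uint32_t *mem, uint32_t with, uint32_t cmp)
{
    return __sync_val_compare_and_swap(mem, cmp, with);
}

/* Try to take the lock: succeeds only when it was 0 (unlocked). */
static int try_lock(volatile uint32_t *lock)
{
    return cas32(lock, 1, 0) == 0;
}

/* Release the lock, like the apr_atomic_dec32() calls in the macro. */
static void unlock(volatile uint32_t *lock)
{
    __sync_fetch_and_sub(lock, 1);
}
```

Unlike apr_thread_mutex_lock(), this never blocks: a caller that loses the race simply skips the cache and computes the value itself.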
>>
>> The code would be like this:
>>
>> #define TIME_CACHE_FUNCTION(VALUE_SIZE, CACHE_T, CACHE_PTR, CACHE_SIZE_POWER,\
>>                             CALC_FUNC, AFTER_READ_WORK)\
>>     const apr_int64_t seconds = apr_time_sec(t);\
>>     apr_status_t status;\
>>     CACHE_T * const cache_element = \
>>         &(CACHE_PTR[seconds & ((1<<CACHE_SIZE_POWER)-1)]);\
>>     /* seconds==0 can be confused with uninitialized cache; don't use cache */\
>>     if (seconds==0) return CALC_FUNC(value, t);\
>>     if (apr_atomic_cas32(&cache_element->lock, 1, 0)==0) {\
>>         if (seconds == cache_element->key) {\
>>             memcpy(value, &cache_element->value, VALUE_SIZE);\
>>             apr_atomic_dec32(&cache_element->lock);\
>>             AFTER_READ_WORK;\
>>             return APR_SUCCESS;\
>>         }\
>>         if (seconds < cache_element->key) {\
>>             apr_atomic_dec32(&cache_element->lock);\
>>             return CALC_FUNC(value, t);\
>>         }\
>>         apr_atomic_dec32(&cache_element->lock);\
>>     }\
>>     status = CALC_FUNC(value, t);\
>>     if (status == APR_SUCCESS) {\
>>         if (apr_atomic_cas32(&cache_element->lock, 1, 0)==0) {\
>>             if (seconds > cache_element->key) {\
>>                 cache_element->key = seconds;\
>>                 memcpy(&cache_element->value, value, VALUE_SIZE);\
>>             }\
>>             apr_atomic_dec32(&cache_element->lock);\
>>         }\
>>     }\
>>     return status;
>>
>> --------------------------------------------------
>>
>> typedef struct {
>>     apr_int64_t key;
>>     apr_uint32_t lock;
>>     apr_time_exp_t value;
>> } explode_time_cache_t;
>>
>> TIME_CACHE(explode_time_cache_t, explode_time_lt_cache,
>>            TIME_CACHE_SIZE_POWER)
>>
>> AP_DECLARE(apr_status_t) ap_explode_recent_localtime(
>>     apr_time_exp_t * value, apr_time_t t)
>> {
>>     TIME_CACHE_FUNCTION(
>>         sizeof(apr_time_exp_t), explode_time_cache_t, explode_time_lt_cache,
>>         TIME_CACHE_SIZE_POWER, apr_time_exp_lt,
>>         value->tm_usec = (apr_int32_t) apr_time_usec(t))
>> }
>>
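To make the quoted macro's behavior concrete, here is a compilable sketch of the same per-element try-lock cache pattern, with GCC __sync builtins standing in for apr_atomic_cas32()/apr_atomic_dec32() and a trivial calc() replacing apr_time_exp_lt(). All names here (cached_calc, calc_calls, etc.) are illustrative, not from httpd, and the version-skew checks (seconds < key) are simplified:

```c
#include <stdint.h>

#define CACHE_SIZE_POWER 4

typedef struct {
    int64_t key;             /* seconds this slot caches; 0 = uninitialized */
    volatile uint32_t lock;  /* 0 = unlocked, 1 = held */
    long value;              /* the cached computed value */
} cache_elem_t;

static cache_elem_t cache[1 << CACHE_SIZE_POWER];
static int calc_calls;       /* counts cache misses, for the test below */

/* The expensive computation being cached (stand-in for apr_time_exp_lt()). */
static long calc(int64_t seconds)
{
    ++calc_calls;
    return seconds * 2;
}

static long cached_calc(int64_t seconds)
{
    long value;
    cache_elem_t *e = &cache[seconds & ((1 << CACHE_SIZE_POWER) - 1)];

    if (seconds == 0)        /* 0 is ambiguous with an empty slot */
        return calc(seconds);

    /* Try-lock: CAS the lock from 0 to 1; old value 0 means we hold it. */
    if (__sync_val_compare_and_swap(&e->lock, 0, 1) == 0) {
        if (seconds == e->key) {     /* hit: copy out under the lock */
            value = e->value;
            __sync_fetch_and_sub(&e->lock, 1);
            return value;
        }
        __sync_fetch_and_sub(&e->lock, 1);
    }

    /* Miss (or lock busy): compute outside the lock, then try to store. */
    value = calc(seconds);
    if (__sync_val_compare_and_swap(&e->lock, 0, 1) == 0) {
        if (seconds > e->key) {      /* only move the cache forward in time */
            e->key = seconds;
            e->value = value;
        }
        __sync_fetch_and_sub(&e->lock, 1);
    }
    return value;
}
```

The key property the macro relies on is visible here: the lock is only ever held across a few loads and stores, never across the expensive computation, so a thread that loses the CAS race just recomputes rather than blocking.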