On 11/29/2011 04:49 AM, Alan Modra wrote: > Since there were a number of changes requested, and some confusion > over what was happening in this code, it might be better if I post > a new patch and we start over. In this patch I'm still using the MSB > for the wait flag, and count in the low 31 bits, as that way is > slightly more efficient on powerpc. However, if some other target > generates really bad code for the masking operations, then I'm happy > enough to change the comment and use > #define SEM_WAIT 1 > #define SEM_INC 2
Agreed we can wait for more data for this. I think readability of the file with these two names is significantly better already. > +static inline bool > +likely_compare_exchange (int *ptr, int *expected, int val, bool weak, > + enum memmodel succ, enum memmodel fail) > { > + return __builtin_expect (__atomic_compare_exchange_n (ptr, expected, val, > + weak, succ, fail), 1); > } Please move this to libgomp.h just below the memmodel definition, as a macro so that it's not tied to int. It seems likely that we'll want this elsewhere in libgomp as we convert things. > +gomp_sem_wait (gomp_sem_t *sem) > { > + int count = *sem; > + > + while ((count & ~SEM_WAIT) != 0) > + if (likely_compare_exchange (sem, &count, count - SEM_INC, > + false, MEMMODEL_ACQUIRE, MEMMODEL_RELAXED)) > + return; Tight loops like this should use the weak compare-exchange so that we don't get two nested loops without need. > +gomp_sem_post (gomp_sem_t *sem) > +{ > + int count = *sem; > + > + while (1) > + if (likely_compare_exchange (sem, &count, ((count + SEM_INC) & > ~SEM_WAIT), > + false, MEMMODEL_RELEASE, MEMMODEL_RELAXED)) > + break; Likewise. And this morning I finally follow the logic. I agree it's correct. I do think we need more commentary here so that next month I can still follow it. > +gomp_sem_wait_slow (gomp_sem_t *sem, int count) > { > + /* First loop spins a while. */ > + while (count == 0) > + if (do_spin (sem, 0) > + /* Spin timeout, nothing changed. Set waiting flag. */ > + && likely_compare_exchange (sem, &count, SEM_WAIT, > + false, MEMMODEL_ACQUIRE, MEMMODEL_RELAXED)) > + { For the record, here I agree we should use the strong CAS, because the CAS is not the only thing in the loop. We don't want a syscall just because the store-conditional failed. Ok with those changes. r~