On Thu, Jun 09, 2016 at 07:36:40PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 09, 2016 at 03:40:58PM +0100, David Howells wrote:
> > Peter Zijlstra <pet...@infradead.org> wrote:
> >
> > > Blergh; so looking at more asm there's still a few tricks we cannot do.
> > > So while overall size is down, some paths do end up more expensive. (It
> > > typically boils down to creative use of condition flags, which is very
> > > hard in C)
> >
> > It can be done using ISO __atomic_fetch_add() and suchlike.
>
> (ISO-C11, ISO as such is a bad abbreviation I think)
>
> Maybe, but we're almost there with __GCC_ASM_FLAG_OUTPUTS__.
>
> atomic_long_add_negative() can be made to do inc;j(n)s for __down_read.
>
> the try_cmpxchg family you wanted to add independent from the ISO-C11
> bits can do the cmpxchg-j(n)z for __down_{read,write}_trylock.
>
> That only leaves us wanting an atomic_long_fetch_add_negative() for
> __up_{read,write}().
>
> Although I suppose, for this to be of use for our weakly ordered
> friends, we need _relaxed versions of all that (so that _acquire and
> _release variants are generated).
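
As a rough illustration of the operations named in the quoted text, here
is a sketch written against ISO C11 <stdatomic.h> rather than the kernel's
own atomic_long_* API. The sketch_* names are hypothetical and exist only
for this example. Plain C like this only shows the semantics and where
acquire ordering would attach; it does not guarantee that the compiler
emits the inc;j(n)s or cmpxchg;j(n)z sequences discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper: atomically add i to *v with acquire ordering and
 * report whether the resulting value is negative. The sum is computed in
 * unsigned arithmetic to avoid signed-overflow undefined behaviour; the
 * conversion back to long assumes a two's-complement target.
 */
static inline bool sketch_add_negative_acquire(atomic_long *v, long i)
{
	long old = atomic_fetch_add_explicit(v, i, memory_order_acquire);

	return (long)((unsigned long)old + (unsigned long)i) < 0;
}

/*
 * Hypothetical helper in the try_cmpxchg style: returns true on success.
 * On failure, *old is updated to the value actually found in *v, so the
 * caller can retry without a separate reload.
 */
static inline bool sketch_try_cmpxchg_acquire(atomic_long *v, long *old,
					      long new_val)
{
	return atomic_compare_exchange_strong_explicit(v, old, new_val,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Small single-threaded check of the two helpers above. */
static inline void sketch_selftest(void)
{
	atomic_long v = 0;
	long expected = 0;

	/* 0 + 1 is not negative. */
	assert(!sketch_add_negative_acquire(&v, 1));

	/* *v is 1, not 0, so the exchange fails and expected becomes 1. */
	assert(!sketch_try_cmpxchg_acquire(&v, &expected, 7));
	assert(expected == 1);
}
```

Swapping memory_order_acquire for memory_order_relaxed or
memory_order_release in the calls above gives the other ordering variants
that the last quoted paragraph asks for.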

Historically, the compilers have won this sort of contest over the long
term. That said, there is nothing quite like raising the bar for them
to help them generate decent code.

So, David and Peter, I am behind both of you 100%. ;-)

							Thanx, Paul