http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47031
--- Comment #7 from Nicola Pero <nicola at gcc dot gnu.org> 2011-01-08 11:39:38 UTC ---
> Usually, the lock is not held. If it is, you do a little trick: You spin 10
> times and if you still could not get the lock, it's likely the current thread
> is blocking another thread from releasing the spinlock. Again, quite unlikely,
> as the spinlock is only held for an extremely short amount of time. However,
> if it happens that after 10 spins you still could not get the lock, you call
> sched_yield() to NOT waste resources.
>
> So, in the worst case, you waste 10 spins. That's basically 10 compares.
> That's nothing compared to a user/kernelspace switch, which is often 10 times
> more.

Well, but locking a mutex on Linux is implemented on top of futexes and does
not require a user/kernelspace switch unless the lock is already held (in which
case a spinlock requires a switch too, via sched_yield()). ;-)

So, basically, on Linux the standard mutexes are already optimized. They are
not quite as fast as spinlocks in the uncontended case, but almost, and they
avoid the problems of spinlocks in the contended case. My benchmarks confirm
that; there is nothing like the 10x difference you mention in the uncontended
case. :-)

Maybe you benchmarked or used other platforms in the past, and you may have a
very good point there. If objc_mutex_lock() and objc_mutex_unlock() always
perform a system call each on some systems, the mutex-protected accessor could
be so much slower (100x?) than the spinlock-protected accessor in the
uncontended case that it may make sense to increase the number of accessor
locks (say, to 64) to reduce the chance of contention, and then use spinlocks
there. :-)

On the other hand, mutexes are easy to port, have already been ported, and are
known to work well out of the box. So, in terms of maintenance, I wouldn't mind
sticking with them for all the other, less-used platforms too. They may not be
fast, but at least they always work. ;-)

It would still be good to try a worst-case benchmark of spinlocks in the highly
contended case. I am assuming the performance would be really, really bad, but
then I may just be wrong. ;-)

Thanks
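
For reference, the spin-then-yield scheme described in the quoted text could be sketched roughly as below. This is only an illustration, not the libobjc code: it assumes C11 atomics are available, and the names spin_yield_lock, SPINS_BEFORE_YIELD, demo_worker and run_demo are made up for the example. run_demo() is the kind of highly contended loop a worst-case benchmark could be built around, although here it only checks correctness and does not measure anything.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical constant: the "10 spins" from the quoted proposal. */
#define SPINS_BEFORE_YIELD 10

/* Hypothetical lock type: a single test-and-set flag. */
typedef struct {
    atomic_flag held;
} spin_yield_lock;

#define SPIN_YIELD_LOCK_INIT { ATOMIC_FLAG_INIT }

static void spin_yield_lock_acquire(spin_yield_lock *l)
{
    for (;;) {
        /* Try a bounded number of times without entering the kernel. */
        for (int i = 0; i < SPINS_BEFORE_YIELD; i++) {
            if (!atomic_flag_test_and_set_explicit(&l->held,
                                                   memory_order_acquire))
                return; /* flag was clear, we now hold the lock */
        }
        /* Still held: give up the CPU so the holder can make progress.
           This is a system call, so the contended path is not free. */
        sched_yield();
    }
}

static void spin_yield_lock_release(spin_yield_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Contended demo: every thread increments one shared counter. */
static spin_yield_lock demo_lock = SPIN_YIELD_LOCK_INIT;
static long demo_counter;

static void *demo_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_yield_lock_acquire(&demo_lock);
        demo_counter++;
        spin_yield_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter. */
static long run_demo(int nthreads, long iterations)
{
    pthread_t threads[16];
    assert(nthreads > 0 && nthreads <= 16);
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, demo_worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return demo_counter;
}
```

Calling run_demo(4, 10000) should return 40000 if the lock provides mutual exclusion. Swapping the acquire/release pair for pthread_mutex_lock()/pthread_mutex_unlock() and timing both variants would give a rough version of the comparison discussed above.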