Ingo Molnar <[EMAIL PROTECTED]> writes:

> interestingly, the x86 spinlock implementation uses a LOCK-ed
> instruction only on acquire - it uses a simple atomic write (and
> implicit barrier assumption) on the way out:
>
>         #define spin_unlock_string \
>                 "movb $1,%0" \
>                         :"=m" (lock->slock) : : "memory"
>
> no LOCK prefix. Due to this spinlocks can sometimes be _cheaper_ than
> doing the same via atomic inc/dec.
Unfortunately, kernels are often compiled for the PPro, and on those a LOCK
prefix is used anyway to work around bugs in early steppings. This makes
spinlocks considerably slower (there are some lock-intensive, not even
particularly micro, benchmarks that show the difference clearly). In that
case it uses

        #define spin_unlock_string \
                "xchgb %b0, %1" \
                        :"=q" (oldval), "=m" (lock->lock) \
                        :"0" (oldval) : "memory"

which has an implicit LOCK and is equally slow.

I looked some time ago at patching it at runtime using alternative(), but
it would have bloated the patch tables a lot. Another way would be a
CONFIG_PPRO_BUT_UP_ON_BUGGY_ONES, but it is hard to find the exact
steppings with the problems.

-Andi
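[Editor's note: the distinction above - a plain store on unlock versus an
xchg carrying an implicit LOCK - can be sketched outside the kernel with
C11 atomics. This is an illustrative user-space sketch, not the kernel's
actual spinlock code; on x86, atomic_store with release ordering typically
compiles to a simple mov, while atomic_exchange compiles to xchg, which is
implicitly LOCK-ed.]

        #include <stdatomic.h>
        #include <stdio.h>

        /* 1 = unlocked, 0 = held, mirroring the slock convention above. */
        static atomic_int slock = 1;

        /* Acquire always needs a LOCK-ed read-modify-write. */
        static int try_lock(void)
        {
                int expected = 1;
                return atomic_compare_exchange_strong(&slock, &expected, 0);
        }

        /* Plain-store unlock: on x86 this is just a mov, no LOCK prefix,
         * relying on x86's strong store ordering for the release. */
        static void unlock_plain(void)
        {
                atomic_store_explicit(&slock, 1, memory_order_release);
        }

        /* Exchange-based unlock: compiles to xchg on x86, which carries
         * an implicit LOCK prefix and is correspondingly slower. */
        static void unlock_xchg(void)
        {
                atomic_exchange_explicit(&slock, 1, memory_order_acq_rel);
        }

        int main(void)
        {
                if (try_lock()) {
                        unlock_plain();
                        printf("plain unlock ok\n");
                }
                if (try_lock()) {
                        unlock_xchg();
                        printf("xchg unlock ok\n");
                }
                return 0;
        }

Both unlocks leave the lock in the same state; only the generated
instruction (and its bus-locking cost) differs.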