Ingo Molnar <[EMAIL PROTECTED]> writes:
>
> interestingly, the x86 spinlock implementation uses a LOCK-ed
> instruction only on acquire - it uses a simple atomic write (and
> implicit barrier assumption) on the way out:
>
>  #define spin_unlock_string \
>          "movb $1,%0" \
>                  :"=m" (lock->slock) : : "memory"
>
> no LOCK prefix. Due to this spinlocks can sometimes be _cheaper_ than
> doing the same via atomic inc/dec.

Unfortunately kernels are often compiled for PPro, and on those a LOCK
prefix is used on unlock anyway, to work around some bugs in early
steppings. This makes spinlocks considerably slower (there are some
lock-intensive benchmarks, and not even particularly micro ones, that
show the difference clearly).

The PPro-targeted build then uses

#define spin_unlock_string \
        "xchgb %b0, %1" \
                :"=q" (oldval), "=m" (lock->lock) \
                :"0" (oldval) : "memory"


which has an implicit LOCK and is equally slow.
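
To get a feel for the cost, here is a rough userspace timing of just
the two unlock sequences back to back (purely illustrative; this is not
one of the benchmarks mentioned above, and the numbers vary a lot
between CPU generations):

/* Build: gcc -O2 unlock_cost.c -o unlock_cost  (x86/x86-64, GCC) */
#include <stdio.h>
#include <time.h>

#define ITERS 100000000UL

static volatile unsigned char slock = 1;

static double elapsed(struct timespec a, struct timespec b)
{
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
        struct timespec t0, t1, t2;
        unsigned char oldval = 1;
        unsigned long i;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < ITERS; i++)     /* plain store unlock */
                __asm__ __volatile__("movb $1, %0"
                                     : "=m" (slock) : : "memory");
        clock_gettime(CLOCK_MONOTONIC, &t1);
        for (i = 0; i < ITERS; i++)     /* xchg unlock: implicit LOCK */
                __asm__ __volatile__("xchgb %b0, %1"
                                     : "=q" (oldval), "=m" (slock)
                                     : "0" (oldval) : "memory");
        clock_gettime(CLOCK_MONOTONIC, &t2);

        printf("movb  unlock: %.2f ns/op\n", elapsed(t0, t1) / ITERS * 1e9);
        printf("xchgb unlock: %.2f ns/op\n", elapsed(t1, t2) / ITERS * 1e9);
        return 0;
}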

I looked some time ago at patching it at runtime using alternative(),
but that would have bloated the patch tables a lot. Another way would
be a CONFIG_PPRO_BUT_UP_ON_BUGGY_ONES option, but it is hard to pin
down the exact steppings that have the problems.
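
Just to show the shape of that run-time decision (this is not the
alternative() patch itself, which rewrites the instructions in place at
boot), here is a userspace sketch that picks an unlock routine from the
CPUID family/model/stepping; the stepping cut-off is a pure
placeholder, since pinning down the affected steppings is exactly the
hard part:

#include <cpuid.h>
#include <stdio.h>

static volatile unsigned char slock = 1;

static void unlock_movb(void)           /* plain store, no LOCK */
{
        __asm__ __volatile__("movb $1, %0" : "=m" (slock) : : "memory");
}

static void unlock_xchgb(void)          /* implicit LOCK, safe everywhere */
{
        unsigned char oldval = 1;
        __asm__ __volatile__("xchgb %b0, %1"
                             : "=q" (oldval), "=m" (slock)
                             : "0" (oldval) : "memory");
}

static void (*spin_unlock_op)(void) = unlock_xchgb;    /* safe default */

static void pick_unlock(void)
{
        unsigned int eax, ebx, ecx, edx;
        unsigned int family, model, stepping;

        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                return;

        family   = (eax >> 8) & 0xf;
        model    = ((eax >> 12) & 0xf0) | ((eax >> 4) & 0xf);
        stepping = eax & 0xf;

        /* Pentium Pro is family 6, model 1.  Which steppings are buggy
           is the open question; "< 9" is purely a placeholder. */
        if (!(family == 6 && model == 1 && stepping < 9))
                spin_unlock_op = unlock_movb;
}

int main(void)
{
        pick_unlock();
        spin_unlock_op();
        printf("using %s unlock\n",
               spin_unlock_op == unlock_movb ? "movb" : "xchgb");
        return 0;
}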

-Andi