for those without much mwait experience, mwait is a kernel-only primitive
(as per the instructions) that pauses the processor until a change has been
made in some range of memory.  the size is determined by probing the h/w,
but think cacheline.  so the discussion of locking is kernel specific as well.
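
roughly, the flavor is something like this (just a sketch with made-up names and
gcc-style asm, not code lifted from any kernel):

	static void
	waitfor(volatile unsigned long *addr, unsigned long old)
	{
		while(*addr == old){
			/* arm the monitor on the line containing addr */
			asm volatile("monitor" : : "a"(addr), "c"(0), "d"(0) : "memory");
			if(*addr != old)
				break;	/* recheck: it may have changed before the monitor was armed */
			/* stop the processor until that line is written (or an interrupt arrives) */
			asm volatile("mwait" : : "a"(0), "c"(0) : "memory");
		}
	}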

> > On 17 Dec 2013, at 12:00, cinap_len...@felloff.net wrote:
> > 
> > thats a surprising result. by dog pile lock you mean the runq spinlock no?
> > 
> 
> I guess it depends on the HW, but I don't find that so surprising. You are 
> looping
> sending messages to the coherency fabric, which gets congested as a result.
> I have seen that happen.

i assume you mean that there is contention on the cacheline holding the runq lock?
i don't think there's classical congestion, as i believe cachelines not involved in
the mwait would experience no hold up.

> You should back off, but sleeping for a fixed time is not a good solution 
> either.
> Mwait is a perfect solution in this case, there is some latency, but you are 
> in a bad
> place anyway and with mwait, performance does not degrade too much.

mwait() does improve things, and one would expect the latency to always be better
than spinning*.  but as it turns out the current scheduler is pretty hopeless in its
locking anyway.  simply grabbing the lock with lock rather than canlock makes more
sense to me.
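
to illustrate what i mean (the names here are stand-ins, not the real scheduler
code, which does more between tries):

	#include <u.h>
	#include <libc.h>

	Lock rqlock;	/* stand-in for the runq lock */

	void
	dogpile(void)
	{
		/* current style: every idle cpu hammers the try-lock */
		while(!canlock(&rqlock))
			;
		/* ... dequeue a proc ... */
		unlock(&rqlock);
	}

	void
	simple(void)
	{
		/* let lock() do the spinning, and any backoff, internally */
		lock(&rqlock);
		/* ... dequeue a proc ... */
		unlock(&rqlock);
	}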

also, using ticket locks (see 9atom nix kernel) will provide automatic backoff
within the lock.  ticket locks are a poor long-term solution, as they're not really
scalable, but they will scale to 24 cpus much better than tas locks.
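
for reference, the core of a ticket lock is tiny (just the idea, not the 9atom
code; the __sync builtins stand in for whatever atomics the kernel provides):

	typedef struct Ticketlock Ticketlock;
	struct Ticketlock {
		volatile unsigned int	next;	/* next ticket to hand out */
		volatile unsigned int	owner;	/* ticket currently being served */
	};

	static void
	tlock(Ticketlock *l)
	{
		unsigned int t;

		t = __sync_fetch_and_add(&l->next, 1);	/* take a ticket */
		while(l->owner != t)
			;	/* wait our turn; a pause or mwait could go here */
	}

	static void
	tunlock(Ticketlock *l)
	{
		l->owner++;	/* serve the next waiter, in fifo order */
	}

the fifo hand-off is where the implicit backoff comes from: each waiter knows how
far back in line it is, so it can wait proportionally instead of hammering the line.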

mcs locks or some other queueing-style lock is clearly the long-term solution.  but as
charles points out, one would really prefer to figure out a way to fit them to the lock
api.  i have some test code, but testing queueing locks in user space is ... interesting.
i need a new approach.
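
for the curious, a bare-bones mcs lock looks like the sketch below; the api problem
is that acquire and release both need the caller's queue node, which lock(Lock*)
and unlock(Lock*) have nowhere to carry (again, gcc builtins stand in for real
kernel atomics):

	typedef struct Mcsnode Mcsnode;
	struct Mcsnode {
		Mcsnode	*volatile next;
		volatile int	locked;
	};

	typedef struct Mcslock Mcslock;
	struct Mcslock {
		Mcsnode	*volatile tail;
	};

	static void
	mcslock(Mcslock *l, Mcsnode *n)
	{
		Mcsnode *prev;

		n->next = 0;
		n->locked = 1;
		prev = __sync_lock_test_and_set(&l->tail, n);	/* atomic exchange */
		if(prev != 0){
			prev->next = n;
			while(n->locked)
				;	/* spin on our own node: no shared-line traffic */
		}
	}

	static void
	mcsunlock(Mcslock *l, Mcsnode *n)
	{
		if(n->next == 0){
			/* no waiter visible; try to swing the tail back to empty */
			if(__sync_bool_compare_and_swap(&l->tail, n, 0))
				return;
			while(n->next == 0)
				;	/* a waiter is mid-enqueue; wait for the link */
		}
		n->next->locked = 0;	/* hand the lock to the next in line */
	}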

- erik

* have you done tests on this?
