On Tue, 2014-01-14 at 17:06 -0800, Davidlohr Bueso wrote:
> On Tue, 2014-01-14 at 16:33 -0800, Jason Low wrote:
> > When running workloads that have high contention in mutexes on an
> > 8 socket machine, spinners would often spin for a long time with no
> > lock owner.
> >
> > One potential reason for this is that a thread can be preempted
> > after clearing lock->owner but before releasing the lock,
>
> What happens if you invert the order here? So mutex_clear_owner() is
> called after the actual unlocking (__mutex_fastpath_unlock).
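For reference, the suggested inversion would look roughly like this (a
sketch of the !CONFIG_DEBUG_MUTEXES unlock path, from memory rather than
the actual mutex.c):

	/* Current order: clear the owner, then release the lock. */
	void __sched mutex_unlock(struct mutex *lock)
	{
		mutex_clear_owner(lock);
		__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
	}

	/* Inverted order: release the lock first, clear the owner after. */
	void __sched mutex_unlock(struct mutex *lock)
	{
		__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
		mutex_clear_owner(lock);	/* can race with the new owner */
	}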
Reversing the order of __mutex_fastpath_unlock() and mutex_clear_owner()
resulted in a 20+% performance improvement to Ingo's test-mutex
application at 160 threads on an 8 socket box.

I have tried this method before, but what initially concerned me about
clearing the owner after unlocking was that the following scenario may
occur:

	thread 1 releases the lock
	thread 2 acquires the lock (in the fastpath)
	thread 2 sets the owner
	thread 1 clears the owner

In this situation, lock->owner is NULL even though thread 2 holds the
lock.

> > or preempted after
> > acquiring the mutex but before setting lock->owner.
>
> That would be the case _only_ for the fastpath. For the slowpath
> (including optimistic spinning) preemption is already disabled at that
> point.

Right, that applies only to the fastpath lock.

> > In those cases, the
> > spinner cannot check if owner is not on_cpu because lock->owner is NULL.
> >
> > A solution that would address the preemption part of this problem would
> > be to disable preemption between acquiring/releasing the mutex and
> > setting/clearing the lock->owner. However, that would require adding
> > overhead to the mutex fastpath.
>
> It's not uncommon to disable preemption in hotpaths, and the overhead
> should be quite small, actually.
>
> > The solution used in this patch is to limit the # of times a thread
> > can spin on lock->count when !owner.
> >
> > The threshold used in this patch for each spinner was 128, which
> > appeared to be a generous value, but any suggestions on another
> > method to determine the threshold are welcome.
>
> Hmm, generous compared to what? Could you elaborate further on how you
> reached this value? These kinds of magic numbers have produced
> significant debate in the past.

I've observed that when running workloads which don't exhibit this
behavior (long spins with no owner), threads rarely take more than 100
extra spins, so I went with 128 based on those numbers.
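To make the proposed check concrete, here is a rough sketch of the sort
of helper I have in mind; mutex_should_keep_spinning() and
MUTEX_SPIN_THRESHOLD are illustrative names, not necessarily what the
patch itself uses:

	#define MUTEX_SPIN_THRESHOLD	128	/* the magic number in question */

	/*
	 * Return true if the spinner should keep spinning.  The owner-less
	 * spin count is reset whenever an owner is visible, so the threshold
	 * only bounds *consecutive* spins with lock->owner == NULL.
	 */
	static inline bool mutex_should_keep_spinning(struct mutex *lock,
						      unsigned int *nr_unowned)
	{
		struct task_struct *owner = ACCESS_ONCE(lock->owner);

		if (owner) {
			*nr_unowned = 0;
			/* Spin only while the owner is actually running. */
			return owner->on_cpu;
		}

		/*
		 * No visible owner: it may have been preempted between the
		 * fastpath lock and setting lock->owner, or be between
		 * clearing lock->owner and the fastpath unlock.  Allow a
		 * bounded number of extra spins before bailing to the
		 * sleep path.
		 */
		return ++(*nr_unowned) <= MUTEX_SPIN_THRESHOLD;
	}

Since the counter resets whenever an owner becomes visible, the
threshold bounds how long we spin with no owner rather than the total
spin time.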