On Thu, Jul 06, 2017 at 06:41:34PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 06, 2017 at 09:24:12AM -0700, Paul E. McKenney wrote:
> > On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote:
> > > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote:
> > > > And yes, there are architecture-specific optimizations for an
> > > > empty spin_lock()/spin_unlock() critical section, and the current
> > > > arch_spin_unlock_wait() implementations show some of these
> > > > optimizations.
> > > > But I expect that performance benefits would need to be demonstrated at
> > > > the system level.
> > >
> > > I do in fact contended there are any optimizations for the exact
> > > lock+unlock semantics.
> >
> > You lost me on this one.
>
> For the exact semantics you'd have to fully participate in the fairness
> protocol. You have to in fact acquire the lock in order to have the
> other contending CPUs wait (otherwise my earlier case 3 will fail).
>
> At that point I'm not sure there is much actual code you can leave out.
>
> What actual optimization is there left at that point?

Got it. It was just that I was having a hard time parsing your sentence.
You were contending that there are no optimizations for all implementations
for the full semantics.

Thanx, Paul