> It suffers the typical problems all those constructs do; namely it
> wrecks accountability.
That's "government thinking" ;-) - for most real users throughput is more important than accountability. With the right API it ought to also be compile time switchable. > But here that is compounded by the fact that you inject other people's > work into 'your' lock region, thereby bloating lock hold times. Worse, > afaict (from a quick reading) there really isn't a bound on the amount > of work you inject. That should be relatively easy to fix but for this kind of lock you normally get the big wins from stuff that is only a short amount of executing code. The fairness your trade in the cases it is useful should be tiny except under extreme load, where the "accountability first" behaviour would be to fall over in a heap. If your "lock" involves a lot of work then it probably should be a work queue or not using this kind of locking. > And while its a cute collapse of an MCS lock and lockless list style > work queue (MCS after all is a lockless list), saving a few cycles from > the naive spinlock+llist implementation of the same thing, I really > do not see enough justification for any of this. I've only personally dealt with such locks in the embedded space but there it was a lot more than a few cycles because you go from take lock spins pull things into cache do stuff cache lines go write/exclusive unlock take lock move all the cache do stuff etc to take lock queue work pull things into cache do work 1 caches line go write/exclusive do work 2 unlock done and for the kind of stuff you apply those locks you got big improvements. Even on crappy little embedded processors cache bouncing hurts. Even better work merging locks like this tend to improve throughput more the higher the contention unlike most other lock types. The claim in the original post is 3x performance but doesn't explain performance doing what, or which kernel locks were switched and what patches were used. I don't find the numbers hard to believe for a big big box, but I'd like to see the actual use case patches so it can be benched with other workloads and also for latency and the like. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/