On Tue, Jul 17, 2018 at 12:40:19AM +1000, Michael Ellerman wrote:
>
> I guess arguably it's not a very macro benchmark, but we have a
> context_switch benchmark in the tree[1] which we often use to tune
> things, and it degrades badly. It just spins up two threads and has them
> ping-pong using yield.

The one advantage you'd get from putting it in lock() is that you could
do away with smp_mb__after_spinlock().

But yes, I completely forgot about your IO thingy.. those bench results
make me sad :/ ah well.
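
For anyone following along without the tree handy, here is a rough userspace
sketch of the kind of yield ping-pong the quoted benchmark is described as
doing. This is not the in-tree context_switch code; the names yield_pingpong(),
yield_worker() and struct worker_arg are made up for illustration, and pinning
both threads to CPU 0 is my assumption about how to force them to take turns.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not the in-tree benchmark. */
static atomic_long total_yields;

struct worker_arg {
	long iterations;
};

static void *yield_worker(void *p)
{
	const struct worker_arg *arg = p;
	cpu_set_t set;

	/*
	 * Assumption: pin to CPU 0 so both threads compete for one CPU and
	 * each sched_yield() hands over to the other. Best effort only; the
	 * return value is ignored in case CPU 0 is not available to us.
	 */
	CPU_ZERO(&set);
	CPU_SET(0, &set);
	(void)pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

	for (long i = 0; i < arg->iterations; i++) {
		sched_yield();
		atomic_fetch_add_explicit(&total_yields, 1,
					  memory_order_relaxed);
	}
	return NULL;
}

/*
 * Run two threads that each call sched_yield() 'iterations' times.
 * Returns the total number of yields performed, or -1 if a thread
 * could not be created.
 */
long yield_pingpong(long iterations)
{
	pthread_t a, b;
	struct worker_arg arg = { .iterations = iterations };

	assert(iterations >= 0);
	atomic_store(&total_yields, 0);

	if (pthread_create(&a, NULL, yield_worker, &arg) != 0)
		return -1;
	if (pthread_create(&b, NULL, yield_worker, &arg) != 0) {
		pthread_join(a, NULL);
		return -1;
	}

	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return atomic_load(&total_yields);
}
```

Build with -pthread, call yield_pingpong() from a main() and time it with
clock_gettime() to get a yields-per-second figure. The shared counter adds a
little overhead of its own, so treat any number from this as indicative only.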