On Sat, Jul 29, 2017 at 11:58:40AM +1000, Nicholas Piggin wrote:
> I haven't had time to read the thread and understand exactly why you need
> this extra barrier, I'll do it next week. Thanks for cc'ing us on it.

Bottom of here:

  https://lkml.kernel.org/r/[email protected]

is probably the fastest way towards understanding the need for a barrier
after rq->curr assignment.

Any barrier after that assignment is good for us, but so far it looks like
PPC (and only PPC, afaict) doesn't provide any smp_mb() after that point.

> A smp_mb is pretty expensive on powerpc CPUs. Removing the sync from
> switch_to increased thread switch performance by 2-3%. Putting it in
> switch_mm may be a little less painful, but still we have to weigh it
> against the benefit of this new functionality. Would that be a net win
> for the average end-user? Seems unlikely.
>
> But we also don't want to lose sys_membarrier completely. Would it be too
> painful to make MEMBARRIER_CMD_PRIVATE_EXPEDITED return error, or make it
> fall back to a slower case if we decide not to implement it?

One ugly thing we've thought of is tagging each mm that has used
sys_membarrier() and only issuing the smp_mb() for those. That way only
those tasks that actually rely on the syscall get to pay the price.
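
To make the tagging idea concrete, here is a rough userspace sketch, not
kernel code. Everything in it is hypothetical: struct mm_sketch stands in
for the mm, membarrier_tag_mm() for whatever the syscall path would do,
switch_mm_barrier() for a hook after the rq->curr assignment, and a C11
seq_cst fence stands in for smp_mb(). It also assumes that checking only
the incoming mm is enough, which may well not hold; whether the outgoing
mm needs checking too is left open.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mm; only the tag matters for this sketch. */
struct mm_sketch {
    atomic_bool used_membarrier;  /* set once the process has used the syscall */
};

/* Illustration only: counts the full barriers issued (single-threaded use). */
static int barriers_issued;

/* Hypothetical syscall-side hook: tag the mm so later switches pay for the barrier. */
static void membarrier_tag_mm(struct mm_sketch *mm)
{
    assert(mm != NULL);
    atomic_store(&mm->used_membarrier, true);
}

/* Hypothetical switch-side hook, meant to run after the rq->curr assignment. */
static void switch_mm_barrier(struct mm_sketch *next)
{
    /* Untagged mms skip the expensive barrier entirely. */
    if (!atomic_load_explicit(&next->used_membarrier, memory_order_relaxed))
        return;

    atomic_thread_fence(memory_order_seq_cst);  /* stands in for smp_mb() */
    barriers_issued++;
}
```

With this shape, a task that never calls the syscall only pays for one
relaxed load per switch, while tagged tasks pay for the full barrier every
time they are switched in.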

