On Wed, 7 Sep 2016 14:51:47 +0100 Will Deacon <will.dea...@arm.com> wrote:
> On Wed, Sep 07, 2016 at 03:23:54PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 07, 2016 at 10:17:26PM +1000, Nicholas Piggin wrote:
> > > It seems okay, but why not make it a special sched-only function name
> > > to prevent it being used in generic code?
> > >
> > > I would not mind seeing responsibility for the switch barrier moved to
> > > generic context switch code like this (alternative for powerpc reducing
> > > number of hwsync instructions was to add documentation and warnings
> > > about the barriers in arch dependent and independent code). And pairing
> > > it with a spinlock is reasonable.
> > >
> > > It may not strictly be an "smp_" style of barrier if MMIO accesses are to
> > > be ordered here too, despite critical section may only be providing
> > > acquire/release for cacheable memory, so maybe it's slightly more
> > > complicated than just cacheable RCsc?
> >
> > Interesting idea..
> >
> > So I'm not a fan of that raw_spin_lock wrapper, since that would end up
> > with a lot more boiler-plate code than just the one extra barrier.
> >
> > But moving MMIO/DMA/TLB etc.. barriers into this spinlock might not be a
> > good idea, since those are typically fairly heavy barriers, and its
> > quite common to call schedule() without ending up in switch_to().
> >
> > For PowerPC it works out, since there's only SYNC, no other option
> > afaik.
> >
> > But ARM/ARM64 will have to do DSB(ISH) instead of DMB(ISH). IA64 would
> > need to issue "sync.i" and mips-octeon "synciobdma".
> >
> > Will, any idea of the extra cost involved in DSB vs DMB?
>
> DSB is *much* more expensive, since it completes out-of-band communication
> such as MMIO accesses and TLB invalidation, as well as plain old memory
> accesses.
>
> The only reason we have DSB in our __switch_to code is to complete cache
> maintenance in case the task is going to migrate to another CPU; there's
> just no way to know that at the point we need to do the barrier :(

Unfortunately it's not trivial to move such barriers to migrate-time,
because the source CPU may not be involved after the task is switched
out.

This won't prevent ARM32/64 from continuing to do what it does today, if
we note that the arch must provide such barriers *either* in the context
switch lock / barrier, or in its own switch code.

Thanks,
Nick