barrier.h

Leonid Yegoshin Wed, 13 Jan 2016 11:07:13 -0800

(I am trying to answer several mails in one.)

First of all, it seems like some generic notes should be given here:

1. The generic MIPS "SYNC" (aka "SYNC 0") instruction is very heavy on some CPUs. On those CPUs it basically kills the pipelines in each CPU, can do a special memory/IO bus transaction (similar to a "fence") and holds the system until all R/W is completed. It is like the Big Kernel Lock, but worse. So the move to SMP_* kinds of barriers is needed to improve performance, especially on the newest CPUs with long pipelines.

2. The MIPS Arch document may be misleading because the words "ordering" and "completion" mean something different there than in Linux; the SYNC instruction description is written for HW engineers. I wrote that in a separate patch of the same patchset - http://patchwork.linux-mips.org/patch/10505/ "MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers":

This instructions were specifically designed to work for smp_*() sort of
memory barriers in MIPS R2/R3/R5 and R6.

Unfortunately, it's description is very cryptic and is done in HW engineering
style which prevents use of it by SW.

3. I bothered the MIPS Arch team for a long time until I completely understood that MIPS SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do exactly what is required in Documentation/memory-barriers.txt.
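
For illustration only, the mapping described in point 3 could be sketched as below. This is not the code from the patch: the STYPE_* and sketch_smp_*() names are hypothetical, the pairing of each smp_*() barrier with a SYNC variant is an assumption, and on non-MIPS builds C11 fences stand in for the SYNC variants so that the example compiles anywhere. The stype encodings are the ones listed in the MIPS architecture manual.

```c
#include <stdatomic.h>

/* SYNC stype encodings as listed in the MIPS architecture manual. */
#define STYPE_SYNC         0x00 /* the heavy "SYNC 0" */
#define STYPE_SYNC_WMB     0x04
#define STYPE_SYNC_MB      0x10
#define STYPE_SYNC_ACQUIRE 0x11
#define STYPE_SYNC_RELEASE 0x12
#define STYPE_SYNC_RMB     0x13

#if defined(__mips__)
/* Assumes a MIPS R2 or later assembler that accepts "sync <stype>". */
#define SKETCH_SYNC(stype) \
    __asm__ __volatile__("sync %0" : : "i"(stype) : "memory")
#define sketch_smp_mb()  SKETCH_SYNC(STYPE_SYNC_MB)
#define sketch_smp_rmb() SKETCH_SYNC(STYPE_SYNC_RMB)
#define sketch_smp_wmb() SKETCH_SYNC(STYPE_SYNC_WMB)
#else
/* Portable stand-ins (an assumption, not MIPS semantics). */
#define sketch_smp_mb()  atomic_thread_fence(memory_order_seq_cst)
#define sketch_smp_rmb() atomic_thread_fence(memory_order_acquire)
#define sketch_smp_wmb() atomic_thread_fence(memory_order_release)
#endif

static int payload;       /* plain data protected by the flag below */
static atomic_int ready;  /* set to 1 once payload is valid */

/* Writer side: make the payload visible before the flag. */
static void publish(int value)
{
    payload = value;
    sketch_smp_wmb();
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out only if the flag was seen. */
static int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    sketch_smp_rmb();
    *out = payload;
    return 1;
}
```

publish() and try_consume() show the usual write-barrier/read-barrier pairing from Documentation/memory-barriers.txt; neither side needs the heavy "SYNC 0" for this pattern.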



In Peter Zijlstra's mail:

1) you do not make such things selectable; either the hardware needs
them or it doesn't. If it does you _must_ use them, however unlikely.

It is selectable only for MIPS R2, not for MIPS R6. The reason is that most MIPS R2 CPUs have a short pipeline, and on them that SYNC is just a waste of CPU resources, especially taking into account that "lightweight syncs" are converted to a heavy "SYNC 0" in many of those CPUs. However, the latest MIPS/Imagination CPUs have a pipeline long enough to hit a problem: the absence of SYNC at LL/SC inside atomics, barriers etc.

And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12
are _NOT_ transitive and therefore cannot be used to implement the
smp_mb__{before,after} stuff.

That is, in MIPS speak, those SYNC types are Ordering Barriers, not
Completion Barriers.


Please see above, point 2.

That is, currently all architectures -- with exception of PPC -- have
RCsc locks, but using these non-transitive things will get you RCpc
locks.

So yes, MIPS can go RCpc for its locks and share the burden of pain with
PPC, but that needs to be a very concious decision.

I don't understand that - I tried hard but I can't find any words like "RCsc" or "RCpc" in the Documentation/ directory. A web search goes nowhere, of course.



In Will Deacon's mail:

The issue I have with the SYNC description in the text above is that it
describes the single CPU (program order) and the dual-CPU (confusingly
named global order) cases, but then doesn't generalise any further. That
means we can't sensibly reason about transitivity properties when a third
agent is involved. For example, the WRC+sync+addr test:


P0:
Wx = 1

P1:
Rx == 1
SYNC
Wy = 1

P2:
Ry == 1
<address dep>
Rx == 0


I can't find anything to forbid that, given the text. The main problem
is having the SYNC on P1 affect the write by P0.

As I understand that test, the visibility of P0: W[x] = 1 is identical for P1 and P2 here. If P1 got X before the SYNC and writes to Y after the SYNC, then instruction source register dependency tracking in P2 prevents a speculative load of X before P2 obtains Y from the same place as P0/P1 and calculates the address of X. If some load of X in P2 happens before the address dependency calculation, its result is discarded.

Yes, you can't find that in the MIPS SYNC instruction description; it more likely belongs in the CM (Coherence Manager) area. I just pointed this out to the member of our arch team responsible for the documents, and he will think about how to explain that.
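
The shape of the WRC test quoted above can be approximated in portable C11, as in the sketch below. It says nothing about what MIPS hardware guarantees. The assumptions are: a seq_cst fence stands in for SYNC on P1, an acquire load stands in for the address dependency on P2 (which is stronger than a real dependency), and the wrc_count_forbidden() helper and the r1/r2/r3 names are hypothetical. It only makes explicit which outcome the question is about.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r1, r2, r3; /* observations, read only after the joins */

static void *p0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);   /* Wx = 1 */
    return NULL;
}

static void *p1(void *arg)
{
    (void)arg;
    r1 = atomic_load_explicit(&x, memory_order_relaxed);  /* Rx */
    atomic_thread_fence(memory_order_seq_cst);            /* stand-in for SYNC */
    atomic_store_explicit(&y, 1, memory_order_relaxed);   /* Wy = 1 */
    return NULL;
}

static void *p2(void *arg)
{
    (void)arg;
    /* The acquire load stands in for "Ry, then <address dep>". */
    r2 = atomic_load_explicit(&y, memory_order_acquire);  /* Ry */
    r3 = atomic_load_explicit(&x, memory_order_relaxed);  /* Rx */
    return NULL;
}

/*
 * Runs the test `iterations` times and counts how often the outcome
 * asked about (P1 saw x == 1, P2 saw y == 1 but x == 0) was observed.
 * Returns -1 if a thread could not be created.
 */
static int wrc_count_forbidden(int iterations)
{
    void *(*const body[3])(void *) = { p0, p1, p2 };
    int forbidden = 0;

    for (int i = 0; i < iterations; i++) {
        pthread_t t[3];
        int started = 0;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        while (started < 3 &&
               pthread_create(&t[started], NULL, body[started], NULL) == 0)
            started++;
        for (int j = 0; j < started; j++)
            pthread_join(t[j], NULL);
        if (started < 3)
            return -1;
        if (r1 == 1 && r2 == 1 && r3 == 0)
            forbidden++;
    }
    return forbidden;
}
```

Under the C11 memory model this outcome is not allowed, so wrc_count_forbidden() should return 0 on a conforming implementation. Whether the lightweight SYNC types give the same guarantee on MIPS is exactly the documentation question discussed above, and a test run alone cannot settle it.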


- Leonid.




Re: [v3,11/41] mips: reuse asm-generic/barrier.h
