* Andrea Parri (and...@rivosinc.com) wrote:
> > > > Is x86's brand of memory ordering strong enough for Ztso?
> > > > I thought x86 had an optimisation where it was allowed to store forward
> > > > within the current CPU causing stores not to be quite strictly ordered.
> >
> > [...]
> >
> > then a bit further down, '8.2.3.5 Intra-Processor Forwarding Is Allowed'
> > has an example and says
> >
> >   'The memory-ordering model allows concurrent stores by two processors
> >    to be seen in different orders by those two processors; specifically,
> >    each processor may perceive its own store occurring before that of
> >    the other.'
> >
> > Having said that, I remember it's really difficult to trigger; it's ~10
> > years since I saw an example to trigger it, and can't remember it.
>
> AFAICT, Ztso allows the forwarding in question too.  Simulations with
> the axiomatic formalization confirm such expectation:

OK, that seems to be what it says in:
https://five-embeddev.com/riscv-isa-manual/latest/ztso.html

  'In both of these memory models, it is the Load Value Axiom that allows
   a hart to forward a value from its store buffer to a subsequent (in
   program order) load -- that is to say that stores can be forwarded
   locally before they are visible to other harts'

> RISCV intra-processor-forwarding
> {
> 0:x5=1; 0:x6=x; 0:x8=y;
> 1:x5=1; 1:x6=y; 1:x8=x;
> }
>  P0          | P1          ;
>  sw x5,0(x6) | sw x5,0(x6) ;
>  lw x9,0(x6) | lw x9,0(x6) ;
>  lw x7,0(x8) | lw x7,0(x8) ;
> exists
> (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)

(I'm a bit fuzzy reading this...)
So is that the interesting case - where x7 is saying neither processor
saw the other processor's write yet, but they did see their own?

So from a QEMU patch perspective, I think the important thing is that
the flag that's defined is defined and commented in such a way that
it's obvious that local forwarding is allowed; we wouldn't want someone
emulating a stricter CPU (that doesn't allow local forwarding) to go and
use this flag as an indication that the host CPU is that strict.

Dave

> Test intra-processor-forwarding Allowed
> States 4
> 0:x7=0; 0:x9=1; 1:x7=0; 1:x9=1;
> 0:x7=0; 0:x9=1; 1:x7=1; 1:x9=1;
> 0:x7=1; 0:x9=1; 1:x7=0; 1:x9=1;
> 0:x7=1; 0:x9=1; 1:x7=1; 1:x9=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)
> Observation intra-processor-forwarding Sometimes 1 3
> Time intra-processor-forwarding 0.00
> Hash=518e4b9b2f0770c94918ac5d7e311ba5
>
>   Andrea
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
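
For anyone else who finds the litmus test hard to read, here is a rough sketch of the same two-hart test run through a toy operational store-buffer model. This is not the axiomatic formalization used for the simulation quoted above; the model, the `explore` function and its `forwarding` parameter are all made up for illustration. Each hart gets a FIFO store buffer, and with `forwarding=False` a load stalls until the hart's own pending store to that address has drained to memory, which is my assumption for what a "stricter CPU" would do.

```python
# Toy store-buffer model of the intra-processor-forwarding litmus test.
# Instructions are (op, value-or-register, address); x5 is always 1.
PROGRAMS = (
    (("sw", 1, "x"), ("lw", "x9", "x"), ("lw", "x7", "y")),  # P0
    (("sw", 1, "y"), ("lw", "x9", "y"), ("lw", "x7", "x")),  # P1
)


def _set(tup, i, value):
    """Return a copy of tuple `tup` with element i replaced by `value`."""
    return tup[:i] + (value,) + tup[i + 1:]


def explore(forwarding=True):
    """Enumerate all final (0:x7, 0:x9, 1:x7, 1:x9) outcomes.

    forwarding=True : a load may read the hart's own buffered store.
    forwarding=False: (hypothetical stricter CPU) such a load stalls
                      until the buffered store has reached memory.
    """
    # State: (program counters, store buffers, memory, registers)
    init = ((0, 0), ((), ()), (("x", 0), ("y", 0)), ((), ()))
    seen, stack, finals = {init}, [init], set()
    while stack:
        pcs, bufs, mem, regs = stack.pop()
        succs = []
        for p in (0, 1):
            # Option 1: drain the oldest buffered store of hart p to memory.
            if bufs[p]:
                addr, val = bufs[p][0]
                new_mem = tuple(sorted({**dict(mem), addr: val}.items()))
                succs.append((pcs, _set(bufs, p, bufs[p][1:]), new_mem, regs))
            # Option 2: execute hart p's next instruction, if any.
            if pcs[p] == len(PROGRAMS[p]):
                continue
            op, arg, addr = PROGRAMS[p][pcs[p]]
            new_pcs = _set(pcs, p, pcs[p] + 1)
            if op == "sw":
                # Stores go into the local buffer, not straight to memory.
                new_bufs = _set(bufs, p, bufs[p] + ((addr, arg),))
                succs.append((new_pcs, new_bufs, mem, regs))
                continue
            pending = [v for a, v in bufs[p] if a == addr]
            if pending and not forwarding:
                continue  # load must wait for the drain step above
            val = pending[-1] if pending else dict(mem)[addr]
            new_regs = tuple(sorted({**dict(regs[p]), arg: val}.items()))
            succs.append((new_pcs, bufs, mem, _set(regs, p, new_regs)))
        if not succs:
            # Both programs finished and both buffers are empty.
            r0, r1 = dict(regs[0]), dict(regs[1])
            finals.add((r0["x7"], r0["x9"], r1["x7"], r1["x9"]))
        for s in succs:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return finals


if __name__ == "__main__":
    print("with forwarding:   ", sorted(explore(forwarding=True)))
    print("without forwarding:", sorted(explore(forwarding=False)))
```

With forwarding, this toy model reaches the same four final states listed in the quoted output, including the one in the `exists` clause where both harts see their own store (x9=1) but not the other's (x7=0). Without forwarding, that one outcome disappears and three states remain, which is the distinction the flag's comment would need to make obvious.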