On Thu, Feb 08, 2018 at 04:03:41PM +0000, Will Deacon wrote:
> On Thu, Feb 08, 2018 at 04:46:43PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 08, 2018 at 03:30:31PM +0000, Will Deacon wrote:
> > > On Thu, Feb 08, 2018 at 03:00:05PM +0100, Peter Zijlstra wrote:
> > 
> > > > Without this ordering I think it would be possible to lose has_blocked
> > > > and not observe the CPU either.
> > > 
> > > I had a quick look at this, and I think you're right. This looks very much
> > > like an 'R'-shaped test, which means it's smp_mb() all round otherwise Power
> > > will go wrong. That also means the smp_mb__after_atomic() in
> > > nohz_balance_enter_idle *cannot* be an smp_wmb(), so you might want a
> > > comment stating that explicitly.
> > 
> > Thanks Will. BTW, where does that 'R' shape nomenclature come from?
> > This is the first I've heard of it.
> 
> I don't know where it originates from, but the infamous "test6.pdf" has it:
> 
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf
> 
> half way down the first page on the left. It says "needs sync+sync" which

Indeed.  As a curiosity: I've never _observed_ R+lwsync+sync (the lwsync
separating the two writes), and other people who tried found the same

  http://moscova.inria.fr/~maranget/cats7/linux/hard.html#unseen
  http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/ppc051.html#toc8 .

It would be interesting to hear about different results ... ;-)
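
For anyone who wants to poke at the shape itself, here is a minimal
userspace sketch of R using C11 atomics and pthreads. It is only an
illustration, not kernel code: the names x, y, r0 and run_r_litmus() are
made up for this example, and atomic_thread_fence(memory_order_seq_cst) is
assumed to stand in for smp_mb() (sync on Power). The outcome of interest
is "y == 2 at the end and r0 == 0", which full barriers on both sides
should forbid.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Shared locations of the R test (illustrative names, not from the kernel). */
static atomic_int x, y;
static int r0;	/* value of x observed by thread 1; read only after join */

/* Thread 0: Wx=1; full barrier; Wy=1 */
static void *r_thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed stand-in for smp_mb() */
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	return NULL;
}

/* Thread 1: Wy=2; full barrier; Rx */
static void *r_thread1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 2, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed stand-in for smp_mb() */
	r0 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/*
 * Run the test 'iterations' times and count how often the R outcome
 * (final y == 2 and r0 == 0) shows up. Returns -1 if a thread could
 * not be created.
 */
int run_r_litmus(int iterations)
{
	int seen = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t0, t1;

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		r0 = -1;

		if (pthread_create(&t0, NULL, r_thread0, NULL))
			return -1;
		if (pthread_create(&t1, NULL, r_thread1, NULL)) {
			pthread_join(t0, NULL);
			return -1;
		}
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (atomic_load(&y) == 2 && r0 == 0)
			seen++;
	}
	return seen;
}
```

Creating the threads afresh on every iteration makes real overlap rare, so
this is nowhere near a proper litmus harness; it just spells out which two
writes and which read make up the shape.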

  Andrea


> is about as bad as it gets for Power (compare with "2+2w", which gets away
> with lwsync+lwsync). See also:
> 
> http://materials.dagstuhl.de/files/16/16471/16471.DerekWilliams.Slides.pdf
> 
> for a light-hearted, yet technically accurate story about the latter.
> 
> Will
