On Thu, Aug 10, 2017 at 11:13:17AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 10, 2017 at 04:12:13PM +0800, Boqun Feng wrote:
> 
> > > Or is the reason this doesn't work on PPC that it's RCpc?
> 
> So that :-)
> 
> > Here is an example why PPC needs a sync() before the cmpxchg():
> > 
> >     https://marc.info/?l=linux-kernel&m=144485396224519&w=2
> > 
> > and Paul McKenney's detailed explanation about why this could happen:
> > 
> >     https://marc.info/?l=linux-kernel&m=144485909826241&w=2
> > 
> > (Somehow, I feel like he was answering a question similar to the one
> > you ask here ;-))
> 
> Yes, and I had vague memories of having gone over this before, but
> couldn't quickly find things. Thanks!
> 
> > And I think aarch64 doesn't have a problem here because it is "(other)
> > multi-copy atomic". Will?
> 
> Right, it's the RCpc vs RCsc thing. The ARM64 release is, as you say,
> multi-copy atomic, whereas the PPC lwsync is not.
> 
> This still leaves us with the situation that we need an smp_mb() between
> smp_store_release() and a possibly failing cmpxchg() if we want to
> guarantee the cmpxchg()'s load comes after the store-release.

For whatever it is worth, this is why C11 allows specifying one
memory-order strength for the success case and another for the failure
case.  But it is not immediately clear that we need another level
of combinatorial API explosion...
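
For illustration only, here is a rough userspace sketch of that C11
facility, not kernel code.  The names flag, lock and
publish_then_try_lock() are hypothetical, and the seq_cst fence is only a
loose stand-in for smp_mb(): the C11 and Linux-kernel memory models
differ, so treat the mapping as an assumption rather than an equivalence.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int flag;	/* hypothetical: published with release semantics */
static atomic_int lock;	/* hypothetical: target of the compare-exchange */

/* Returns true if the compare-exchange on 'lock' succeeded. */
static bool publish_then_try_lock(void)
{
	int expected = 0;

	/* Loose analogue of smp_store_release(&flag, 1). */
	atomic_store_explicit(&flag, 1, memory_order_release);

	/*
	 * Loose analogue of smp_mb(): a full fence between the store
	 * above and the compare-exchange below, intended to order the
	 * compare-exchange's load after the store even when the
	 * compare-exchange fails.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	/* C11 takes two orderings: acq_rel on success, relaxed on failure. */
	return atomic_compare_exchange_strong_explicit(&lock, &expected, 1,
						       memory_order_acq_rel,
						       memory_order_relaxed);
}
```

Called twice from a single thread, the first call should succeed (lock
goes from 0 to 1) and the second should fail, taking the relaxed failure
ordering.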

                                                        Thanx, Paul
