Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 11:13:48PM +0200, Andrea Parri wrote:
> On Thu, Jul 12, 2018 at 04:43:46PM -0400, Alan Stern wrote:
> > On Thu, 12 Jul 2018, Andrea Parri wrote:
> > 
> > > > It seems reasonable to ask people to learn that locks have stronger
> > > > ordering guarantees than RMW atomics do.  Maybe not the greatest
> > > > situation in the world, but one I think we could live with.
> > > 
> > > Yeah, this was one of my main objections.
> > 
> > Does this mean you don't think you could live with it?
> 
> Well, I *could* live with it and with RCtso locks ;-) but I'd rather not.
> 
> Assuming that I will not be able to resist this RCtso trend ;-) would the
> below (completely untested) work?
> 
>   let rmw = rmw | lk-rmw   (* from lock.cat *)
>   let po-unlock-rf-lock-po = po ; [Release] ; rf ; [domain(rmw)] ; po

domain(rmw) & Acquire, maybe...

  Andrea


>   [the rest of your patch + the updates to the doc I suggested in v2 ;-)]
> 
>Andrea


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 04:43:46PM -0400, Alan Stern wrote:
> On Thu, 12 Jul 2018, Andrea Parri wrote:
> 
> > > It seems reasonable to ask people to learn that locks have stronger
> > > ordering guarantees than RMW atomics do.  Maybe not the greatest
> > > situation in the world, but one I think we could live with.
> > 
> > Yeah, this was one of my main objections.
> 
> Does this mean you don't think you could live with it?

Well, I *could* live with it and with RCtso locks ;-) but I'd rather not.

Assuming that I will not be able to resist this RCtso trend ;-) would the
below (completely untested) work?

  let rmw = rmw | lk-rmw   (* from lock.cat *)
  let po-unlock-rf-lock-po = po ; [Release] ; rf ; [domain(rmw)] ; po
  [the rest of your patch + the updates to the doc I suggested in v2 ;-)]
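
In litmus form, the shape this relation is meant to catch would be
something like the sketch below (equally untested, and the test name is
made up); with the [domain(rmw)] extension the final state should become
forbidden, while with the lock-only relation it would remain allowed:

C po-rel-rf-rmwacq-po

{}

P0(int *s, int *x, int *y)
{
	int r0;

	WRITE_ONCE(*x, 1);
	smp_store_release(s, 1);        /* po ; [Release] ... */
	r0 = cmpxchg_acquire(s, 1, 2);  /* ... rf ; [domain(rmw)] ; po */
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*y);
	smp_rmb();
	r2 = READ_ONCE(*x);
}

exists (0:r0=1 /\ 1:r1=1 /\ 1:r2=0)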

   Andrea


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 09:52:42PM +0200, Andrea Parri wrote:
> On Thu, Jul 12, 2018 at 11:10:58AM -0700, Linus Torvalds wrote:
> > On Thu, Jul 12, 2018 at 11:05 AM Peter Zijlstra  wrote:
> > >
> > > The locking pattern is fairly simple and shows where RCpc comes apart
> > > from expectation real nice.
> > 
> > So who does RCpc right now for the unlock-lock sequence? Somebody
> > mentioned powerpc. Anybody else?
> 
> powerpc has RCtso (and RCpc) but not RCsc unlock-lock, according to the
> following (admittedly original) terminology:
> 
>  - RCsc unlock-lock MUST ORDER:
> 
>   a) the WRITE and the READ below:
> 
>   WRITE x=1
>   UNLOCK s
>   LOCK s
>   READ y
> 
>   as in a store-buffering test;
> 
>   b) the two WRITEs below:
> 
>   WRITE x=1
>   UNLOCK s
>   LOCK s
>   WRITE y=1
> 
>   as in a message-passing test;
> 
>   c) the two READs below:
> 
>   READ x
>   UNLOCK s
>   LOCK s
>   READ y
> 
>   as in a message-passing test;
> 
>   d) the READ and the WRITE below:
> 
>   READ x
>   UNLOCK s
>   LOCK s
>   WRITE y
> 
>   as in a load-buffering test;
> 
>  - RCtso unlock-lock MUST ORDER b), c), d) above.
> 
>  - RCpc unlock-lock MUST ORDER none of the above.
> 
> AFAICT, all archs _in_ the current implementation have RCtso unlock-lock.
> 
> 
> > 
> > How nasty would it be to make powerpc conform? I will always advocate
> > tighter locking and ordering rules over looser ones..
> 
> A simple answer is right above (place a sync somewhere in the sequence);
> for benchmark results, I must defer...

Sorry, not sure why I wrote that; I did intend "conform to RCsc" here.


> 
>   Andrea
> 
> 
> > 
> >Linus


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 11:10:58AM -0700, Linus Torvalds wrote:
> On Thu, Jul 12, 2018 at 11:05 AM Peter Zijlstra  wrote:
> >
> > The locking pattern is fairly simple and shows where RCpc comes apart
> > from expectation real nice.
> 
> So who does RCpc right now for the unlock-lock sequence? Somebody
> mentioned powerpc. Anybody else?

powerpc has RCtso (and RCpc) but not RCsc unlock-lock, according to the
following (admittedly original) terminology:

 - RCsc unlock-lock MUST ORDER:

  a) the WRITE and the READ below:

  WRITE x=1
  UNLOCK s
  LOCK s
  READ y

  as in a store-buffering test;

  b) the two WRITEs below:

  WRITE x=1
  UNLOCK s
  LOCK s
  WRITE y=1

  as in a message-passing test;

  c) the two READs below:

  READ x
  UNLOCK s
  LOCK s
  READ y

  as in a message-passing test;

  d) the READ and the WRITE below:

  READ x
  UNLOCK s
  LOCK s
  WRITE y

  as in a load-buffering test;

 - RCtso unlock-lock MUST ORDER b), c), d) above.

 - RCpc unlock-lock MUST ORDER none of the above.

AFAICT, all archs _in_ the current implementation have RCtso unlock-lock.
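
As a concrete instance, case (a) written out in full (an editorial sketch,
with a made-up name): RCsc unlock-lock would forbid the final state below,
while RCtso unlock-lock, and hence the current powerpc implementation,
allows it.

C SB+unlock-lock

{}

P0(spinlock_t *s, int *x, int *y)
{
	int r0;

	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)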


> 
> How nasty would it be to make powerpc conform? I will always advocate
> tighter locking and ordering rules over looser ones..

A simple answer is right above (place a sync somewhere in the sequence);
for benchmark results, I must defer...

  Andrea


> 
>Linus


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
> It seems reasonable to ask people to learn that locks have stronger
> ordering guarantees than RMW atomics do.  Maybe not the greatest
> situation in the world, but one I think we could live with.

Yeah, this was one of my main objections.


> > Hence my proposal to strengthen rmw-acquire, because that is the basic
> > primitive used to implement lock.
> 
> That was essentially what the v2 patch did.  (And my reasoning was
> basically the same as what you have just outlined.  There was one
> additional element: smp_store_release() is already strong enough for
> TSO; the acquire is what needs to be stronger in the memory model.)

Mmh? See my comments on v2 (and your reply; in particular, the part "At
least, it's not a valid general-purpose implementation").


> > Another, and I like this proposal least, is to introduce a new barrier
> > to make this all work.
> 
> This apparently boils down to two questions:
> 
>   Should spin_lock/spin_unlock be RCsc?
> 
>   Should rmw-acquire be strong enough so that smp_store_release + 
>   rmw-acquire is RCtso?
> 
> If both answers are No, we end up with the v3 patch.  If the first
> answer is No and the second is Yes, we end up with the v2 patch.  The
> problem is that different people seem to want differing answers.

Again, maybe you're confusing v2 with v1?

  Andrea


> 
> (The implicit third question, "Should spin_lock/spin_unlock be RCtso?",
> seems to be pretty well settled at this point -- by Peter's and Will's
> vociferousness if nothing else -- despite Andrea's reservations.  
> However I admit it would be nice to have one or two examples showing
> that the kernel really needs this.)
> 
> Alan
> 


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
> Anyway, back to the problem of being able to use the memory model to
> describe locks. This is I think a useful property.
> 
> My earlier reasoning was that:
> 
>   - smp_store_release() + smp_load_acquire() := RCpc
> 
>   - we use smp_store_release() as unlock()
> 
> Therefore, if we want unlock+lock to imply at least TSO (ideally
> smp_mb()) we need lock to make up for whatever unlock lacks.
> 
> > Hence my proposal to strengthen rmw-acquire, because that is the basic
> primitive used to implement lock.
> 
> But as you (and Will) point out, we don't so much care about rmw-acquire
> semantics as much as that we care about unlock+lock behaviour. Another
> way to look at this is to define:
> 
>   smp-store-release + rmw-acquire := TSO (ideally smp_mb)
> 
> But then we also have to look at:
> 
>   rmw-release + smp-load-acquire
>   rmw-release + rmw-acquire
> 
> for completeness sake, and I would suggest they result in (at least) the
> same (TSO) ordering as the one we really care about.

Indeed (unless I'm not seeing something...  ;-).
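
For instance, the rmw-release + rmw-acquire combination could be probed
with a sketch like the following (name made up); the suggestion above is
that the final state be forbidden, i.e., that the pair provide at least
the TSO ordering:

C MP+rmwrel-rmwacq

{}

P0(int *s, int *x, int *y)
{
	int r0;
	int r1;

	WRITE_ONCE(*x, 1);
	r0 = xchg_release(s, 1);        /* rmw-release, standing in for unlock */
	r1 = xchg_acquire(s, 2);        /* rmw-acquire, standing in for lock */
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r2;
	int r3;

	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*x);
}

exists (1:r2=1 /\ 1:r3=0)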


> 
> One alternative is to no longer use smp_store_release() for unlock(),
> and say define atomic_set_release() to be in the rmw-release class
> instead of being a simple smp_store_release().
> 
> Another, and I like this proposal least, is to introduce a new barrier
> to make this all work.

An smp_tso__after_unlock_lock()?  (In a certain sense, the solution
adopted by RCU aligns to this approach: live with powerpc's RCpc and
introduce smp_mb__after_unlock_lock().)  Or did you have something
else in mind?

But I wouldn't hasten to introduce such a barrier, given that:  (1)
this would be a "do { } while (0)" for all the supported archs _if_
we stuck to the current implementations, and  (2) even if these
implementations changed or some new arch required a non-trivial
definition, we still would have to find a "pure/TSO" use case  ;-).

  Andrea


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 01:52:49PM +0200, Andrea Parri wrote:
> On Thu, Jul 12, 2018 at 09:40:40AM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> > > Simplicity is in the eye of the beholder.  From my POV (LKMM maintainer), the
> > > simplest solution would be to get rid of rfi-rel-acq and unlock-rf-lock-po
> > > (or its analogous in v3) all together:
> > 
> > 
> > 
> > > Among other things, this would immediately:
> > > 
> > >   1) Enable RISC-V to use their .aq/.rl annotations _without_ having to
> > >  "worry" about tso or release/acquire fences; IOW, this will permit
> > >  a partial revert of:
> > > 
> > > >0123f4d76ca6 ("riscv/spinlock: Strengthen implementations with fences")
> > > >5ce6c1f3535f ("riscv/atomic: Strengthen implementations with fences")
> > 
> > But I feel this goes in the wrong direction. This weakens the effective
> > memory model, where I feel we should strengthen it.
> > 
> > Currently PowerPC is the weakest here, and the above RISC-V changes
> > (reverts) would make RISC-V weaker still.
> > 
> > And any effective weakening makes me very uncomfortable -- who knows
> > what will come apart this time. This memory ordering stuff causes
> > horrible subtle bugs at best.
> 
> Indeed, what I was suggesting above is a weakening of the current model
> (and I agree: I wouldn't say that bugs in this context are nice  ;-).
> 
> These changes would affect a specific area: (IMO,) the examples we've
> been considering here aren't for the faint-hearted  ;-) and as Daniel
> already suggested, everything would again be "nice and neat", if this
> was all about locking/if every thread used lock-synchronization.
> 
> 
> > 
> > >   2) Resolve the above mentioned controversy (the inconsistency between
> > >  - locking operations and atomic RMWs on one side, and their actual
> > >  implementation in generic code on the other), thus enabling the use
> > >  of LKMM _and_ its tools for the analysis/reviewing of the latter.
> > 
> > This is a good point; so lets see if there is something we can do to
> > strengthen the model so it all works again.
> > 
> > And I think if we raise atomic*_acquire() to require TSO (but ideally
> > raise it to RCsc) we're there.
> > 
> > The TSO archs have RCpc load-acquire and store-release, but fully
> > ordered atomics. Most of the other archs have smp_mb() everything, with
> > the exception of PPC, ARM64 and now RISC-V.
> > 
> > PPC has the RCpc TSO fence LWSYNC, ARM64 has the RCsc
> > load-acquire/store-release. And RISC-V has a gazillion of options IIRC.
> > 
> > 
> > So ideally atomic*_acquire() + smp_store_release() will be RCsc, and is
> > with the notable exception of PPC, and ideally RISC-V would be RCsc
> > here. But at the very least it should not be weaker than PPC.
> > 
> > By increasing atomic*_acquire() to TSO we also immediately get the
> > proposed:
> > 
> >   P0()
> >   {
> >   WRITE_ONCE(X, 1);
> >   spin_unlock(&s);
> >   spin_lock(&s);
> >   WRITE_ONCE(Y, 1);
> >   }
> > 
> >   P1()
> >   {
> >   r1 = READ_ONCE(Y);
> >   smp_rmb();
> >   r2 = READ_ONCE(X);
> >   }
> > 
> > behaviour under discussion; because the spin_lock() will imply the TSO
> > ordering.
> 
> You mean: "when paired with a po-earlier release to the same memory
> location", right?  I am afraid that neither arm64 nor riscv current
> implementations would ensure "(r1 == 1 && r2 == 0) forbidden" if we
> removed the po-earlier spin_unlock()...
> 
> AFAICT, the current implementation would work with that release: as
> you remarked above, arm64 release->acquire is SC; riscv has an rw,w
> fence in its spin_unlock() (hence a w,w fence between the stores),
> or it could have a .tso fence ...
> 
> But again, these are subtle patterns, and my guess is that several/
> most kernel developers really won't care about such guarantees (and
> if some do, they'll have the tools to figure out what they can
> actually rely on ...)
> 
> OTOH (as I pointed out earlier) the strengthening we're considering
> will prevent some archs (riscv being just today's example!) from going
> "full RCsc", and this will inevitably "complicate" both the LKMM

"full RCpc"

  Andrea


> and the reviewing process of related changes (atomics, locking, ...;
> c.f., this debate), apparently, just because you  ;-) want to "care"
> about these guarantees.
> 
> Not yet convinced ...  :/
> 
>   Andrea
> 
> 
> > 
> > And note that this retains regular RCpc ACQUIRE for smp_load_acquire()
> > and associated primitives -- as they have had since their introduction
> > not too long ago.


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
On Thu, Jul 12, 2018 at 09:40:40AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> > Simplicity is in the eye of the beholder.  From my POV (LKMM maintainer), the
> > simplest solution would be to get rid of rfi-rel-acq and unlock-rf-lock-po
> > (or its analogous in v3) all together:
> 
> 
> 
> > Among other things, this would immediately:
> > 
> >   1) Enable RISC-V to use their .aq/.rl annotations _without_ having to
> >  "worry" about tso or release/acquire fences; IOW, this will permit
> >  a partial revert of:
> > 
> >0123f4d76ca6 ("riscv/spinlock: Strengthen implementations with fences")
> >5ce6c1f3535f ("riscv/atomic: Strengthen implementations with fences")
> 
> But I feel this goes in the wrong direction. This weakens the effective
> memory model, where I feel we should strengthen it.
> 
> Currently PowerPC is the weakest here, and the above RISC-V changes
> (reverts) would make RISC-V weaker still.
> 
> And any effective weakening makes me very uncomfortable -- who knows
> what will come apart this time. This memory ordering stuff causes
> horrible subtle bugs at best.

Indeed, what I was suggesting above is a weakening of the current model
(and I agree: I wouldn't say that bugs in this context are nice  ;-).

These changes would affect a specific area: (IMO,) the examples we've
been considering here aren't for the faint-hearted  ;-) and as Daniel
already suggested, everything would again be "nice and neat", if this
was all about locking/if every thread used lock-synchronization.


> 
> >   2) Resolve the above mentioned controversy (the inconsistency between
> >  - locking operations and atomic RMWs on one side, and their actual
> >  implementation in generic code on the other), thus enabling the use
> >  of LKMM _and_ its tools for the analysis/reviewing of the latter.
> 
> This is a good point; so lets see if there is something we can do to
> strengthen the model so it all works again.
> 
> And I think if we raise atomic*_acquire() to require TSO (but ideally
> raise it to RCsc) we're there.
> 
> The TSO archs have RCpc load-acquire and store-release, but fully
> ordered atomics. Most of the other archs have smp_mb() everything, with
> the exception of PPC, ARM64 and now RISC-V.
> 
> PPC has the RCpc TSO fence LWSYNC, ARM64 has the RCsc
> load-acquire/store-release. And RISC-V has a gazillion of options IIRC.
> 
> 
> So ideally atomic*_acquire() + smp_store_release() will be RCsc, and is
> with the notable exception of PPC, and ideally RISC-V would be RCsc
> here. But at the very least it should not be weaker than PPC.
> 
> By increasing atomic*_acquire() to TSO we also immediately get the
> proposed:
> 
>   P0()
>   {
> WRITE_ONCE(X, 1);
> spin_unlock(&s);
> spin_lock(&s);
> WRITE_ONCE(Y, 1);
>   }
> 
>   P1()
>   {
> r1 = READ_ONCE(Y);
> smp_rmb();
> r2 = READ_ONCE(X);
>   }
> 
> behaviour under discussion; because the spin_lock() will imply the TSO
> ordering.

You mean: "when paired with a po-earlier release to the same memory
location", right?  I am afraid that neither arm64 nor riscv current
implementations would ensure "(r1 == 1 && r2 == 0) forbidden" if we
removed the po-earlier spin_unlock()...
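
Concretely, that variant would be the sketch below (name made up), whose
final state, per the above, the current arm64 and riscv implementations
would not forbid:

C MP+lock-no-unlock

{}

P0(spinlock_t *s, int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	spin_lock(s);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*y);
	smp_rmb();
	r2 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 1:r2=0)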

AFAICT, the current implementation would work with that release: as
you remarked above, arm64 release->acquire is SC; riscv has an rw,w
fence in its spin_unlock() (hence a w,w fence between the stores),
or it could have a .tso fence ...

But again, these are subtle patterns, and my guess is that several/
most kernel developers really won't care about such guarantees (and
if some do, they'll have the tools to figure out what they can
actually rely on ...)

OTOH (as I pointed out earlier) the strengthening we're considering
will prevent some archs (riscv being just today's example!) from going
"full RCsc", and this will inevitably "complicate" both the LKMM
and the reviewing process of related changes (atomics, locking, ...;
c.f., this debate), apparently, just because you  ;-) want to "care"
about these guarantees.

Not yet convinced ...  :/

  Andrea


> 
> And note that this retains regular RCpc ACQUIRE for smp_load_acquire()
> and associated primitives -- as they have had since their introduction
> not too long ago.


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-12 Thread Andrea Parri
> All the discussion here[1] for example is about having ordering and
> doing an smp_cond_load_acquire() on a variable which is sometimes
> protected by a CPU's rq->lock and sometimes not?  Isn't that one of the
> key use cases for this whole discussion?

Not a "pure" one:

  
http://lkml.kernel.org/r/1530629639-27767-1-git-send-email-andrea.pa...@amarulasolutions.com

we also need "W->R ordering" in schedule()! So there had better be an
smp_mb__after_spinlock() or a barrier providing similar ordering.

Nick was suggesting a "weaker version" of this barrier back in:

  362a61ad61199e ("fix SMP data race in pagetable setup vs walking")

cf. the comment in mm/memory.c:__pte_alloc(), but that does not
match our pattern (UNLOCK+LOCK), AFAICT.
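
A barrier which does match UNLOCK+LOCK is RCU's smp_mb__after_unlock_lock();
as a sketch (untested, and assuming the barrier is modeled by the tools,
which I have not checked), it would restore the "W->R ordering", that is,
forbid the final state below:

C SB+unlock-lock+mb-after-unlock-lock

{}

P0(spinlock_t *s, int *x, int *y)
{
	int r0;

	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	smp_mb__after_unlock_lock();    /* upgrades UNLOCK+LOCK to a full barrier */
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)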

  Andrea


> 
> [1] https://lkml.org/lkml/2015/10/6/805
> 
> Dan


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-11 Thread Andrea Parri
> It might be simple to model, but I worry this weakens our locking
> implementations to a point where they will not be understood by the average
> kernel developer. As I've said before, I would prefer "full" RCsc locking,
> but that's not the case with architectures we currently support today, so
> the next best thing is this "everything apart from W->R in the
> inter-thread case" ordering, which isn't going to crop up unless you're
> doing weird stuff anyway afaict.

The "average kernel developer" thinks TSO or about, right?  ;-)

  Andrea


> 
> Will


Re: [PATCH v3] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-11 Thread Andrea Parri
On Wed, Jul 11, 2018 at 08:42:11AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 11, 2018 at 10:43:45AM +0100, Will Deacon wrote:
> > Hi Alan,
> > 
> > On Tue, Jul 10, 2018 at 02:18:13PM -0400, Alan Stern wrote:
> > > More than one kernel developer has expressed the opinion that the LKMM
> > > should enforce ordering of writes by locking.  In other words, given
> > > the following code:
> > > 
> > >   WRITE_ONCE(x, 1);
> > >   spin_unlock(&s):
> > >   spin_lock(&s);
> > >   WRITE_ONCE(y, 1);
> > > 
> > > the stores to x and y should be propagated in order to all other CPUs,
> > > even though those other CPUs might not access the lock s.  In terms of
> > > the memory model, this means expanding the cumul-fence relation.
> > > 
> > > Locks should also provide read-read (and read-write) ordering in a
> > > similar way.  Given:
> > > 
> > >   READ_ONCE(x);
> > >   spin_unlock(&s);
> > >   spin_lock(&s);
> > >   READ_ONCE(y);   // or WRITE_ONCE(y, 1);
> > > 
> > > the load of x should be executed before the load of (or store to) y.
> > > The LKMM already provides this ordering, but it provides it even in
> > > the case where the two accesses are separated by a release/acquire
> > > pair of fences rather than unlock/lock.  This would prevent
> > > architectures from using weakly ordered implementations of release and
> > > acquire, which seems like an unnecessary restriction.  The patch
> > > therefore removes the ordering requirement from the LKMM for that
> > > case.
> > > 
> > > All the architectures supported by the Linux kernel (including RISC-V)
> > > do provide this ordering for locks, albeit for varying reasons.
> > > Therefore this patch changes the model in accordance with the
> > > developers' wishes.
> > > 
> > > Signed-off-by: Alan Stern 
> > 
> > Thanks, I'm happy with this version of the patch:
> > 
> > Reviewed-by: Will Deacon 
> 
> I have applied your Reviewed-by, and thank you both!
> 
> Given that this is a non-trivial change and given that I am posting
> for -tip acceptance in a few days, I intend to send this one not
> to the upcoming merge window, but to the one after that.
> 
> Please let me know if there is an urgent need for this to go into the
> v4.19 merge window.

I raised some concerns in my review to v2; AFAICT, these concerns have
not been resolved: so, until then, please feel free to add my NAK. ;-)

  Andrea


> 
>   Thanx, Paul
> 
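
The read-read snippet quoted above, written out as a complete litmus test
(a sketch; the name is made up); with the lock-based ordering the patch
retains, the final state should be forbidden:

C MP+unlock-lock-reads

{}

P0(spinlock_t *s, int *x, int *y)
{
	int r0;
	int r1;

	spin_lock(s);
	r0 = READ_ONCE(*x);
	spin_unlock(s);
	spin_lock(s);
	r1 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	WRITE_ONCE(*y, 1);
	smp_wmb();
	WRITE_ONCE(*x, 1);
}

exists (0:r0=1 /\ 0:r1=0)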


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-11 Thread Andrea Parri
On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> On Wed, Jul 11, 2018 at 10:43:11AM +0100, Will Deacon wrote:
> > On Tue, Jul 10, 2018 at 11:38:21AM +0200, Andrea Parri wrote:
> > > On Mon, Jul 09, 2018 at 04:01:57PM -0400, Alan Stern wrote:
> > > > More than one kernel developer has expressed the opinion that the LKMM
> > > > should enforce ordering of writes by locking.  In other words, given
> > > 
> > > I'd like to step back on this point: I still don't have a strong opinion
> > > on this, but all this debating made me curious about others' opinion ;-)
> > > I'd like to see the above argument expanded: what's the rationale behind
> > > that opinion? can we maybe add references to actual code relying on that
> > > ordering? others that I've been missing?
> > > 
> > > I'd extend these same questions to the "ordering of reads" snippet below
> > > (and discussed since so long...).
> > > 
> > > 
> > > > the following code:
> > > > 
> > > > WRITE_ONCE(x, 1);
> > > > spin_unlock(&s):
> > > > spin_lock(&s);
> > > > WRITE_ONCE(y, 1);
> > > > 
> > > > the stores to x and y should be propagated in order to all other CPUs,
> > > > even though those other CPUs might not access the lock s.  In terms of
> > > > the memory model, this means expanding the cumul-fence relation.
> > > > 
> > > > Locks should also provide read-read (and read-write) ordering in a
> > > > similar way.  Given:
> > > > 
> > > > READ_ONCE(x);
> > > > spin_unlock(&s);
> > > > spin_lock(&s);
> > > > READ_ONCE(y);   // or WRITE_ONCE(y, 1);
> > > > 
> > > > the load of x should be executed before the load of (or store to) y.
> > > > The LKMM already provides this ordering, but it provides it even in
> > > > the case where the two accesses are separated by a release/acquire
> > > > pair of fences rather than unlock/lock.  This would prevent
> > > > architectures from using weakly ordered implementations of release and
> > > > acquire, which seems like an unnecessary restriction.  The patch
> > > > therefore removes the ordering requirement from the LKMM for that
> > > > case.
> > > 
> > > IIUC, the same argument could be used to support the removal of the new
> > > unlock-rf-lock-po (we already discussed riscv .aq/.rl, it doesn't seem
> > > hard to imagine an arm64 LDAPR-exclusive, or the adoption of ctrl+isync
> > > on powerpc).  Why are we effectively preventing their adoption?  Again,
> > > I'd like to see more details about the underlying motivations...
> > > 
> > > 
> > > > 
> > > > All the architectures supported by the Linux kernel (including RISC-V)
> > > > do provide this ordering for locks, albeit for varying reasons.
> > > > Therefore this patch changes the model in accordance with the
> > > > developers' wishes.
> > > > 
> > > > Signed-off-by: Alan Stern 
> > > > 
> > > > ---
> > > > 
> > > > v.2: Restrict the ordering to lock operations, not general release
> > > > and acquire fences.
> > > 
> > > This is another controversial point, and one that makes me shiver ...
> > > 
> > > I have the impression that we're dismissing the suggestion "RMW-acquire
> > > on par with LKR" in a bit of a rush.  So, this patch is implying that:
> > > 
> > >   while (cmpxchg_acquire(&s, 0, 1) != 0)
> > >   cpu_relax();
> > > 
> > > is _not_ a valid implementation of spin_lock()! or, at least, it is not
> > > when paired with an smp_store_release(). Will was anticipating inserting
> > > arch hooks into the (generic) qspinlock code,  when we know that similar
> > > patterns are spread all over in (q)rwlocks, mutexes, rwsem, ... (please
> > > also notice that the informal documentation is currently treating these
> > > synchronization mechanisms equally as far as "ordering" is concerned...).
> > > 
> > > This distinction between locking operations and "other acquires" appears
> > > to me not only unmotivated but also extremely _fragile (difficult to use
> > > /maintain) when considering the analysis of synchronization mechanisms
> > > such as those mentioned above or their porting for new arch.

Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-11 Thread Andrea Parri
On Wed, Jul 11, 2018 at 10:43:11AM +0100, Will Deacon wrote:
> On Tue, Jul 10, 2018 at 11:38:21AM +0200, Andrea Parri wrote:
> > On Mon, Jul 09, 2018 at 04:01:57PM -0400, Alan Stern wrote:
> > > More than one kernel developer has expressed the opinion that the LKMM
> > > should enforce ordering of writes by locking.  In other words, given
> > 
> > I'd like to step back on this point: I still don't have a strong opinion
> > on this, but all this debating made me curious about others' opinion ;-)
> > I'd like to see the above argument expanded: what's the rationale behind
> > that opinion? can we maybe add references to actual code relying on that
> > ordering? others that I've been missing?
> > 
> > I'd extend these same questions to the "ordering of reads" snippet below
> > (and discussed since so long...).
> > 
> > 
> > > the following code:
> > > 
> > >   WRITE_ONCE(x, 1);
> > >   spin_unlock(&s):
> > >   spin_lock(&s);
> > >   WRITE_ONCE(y, 1);
> > > 
> > > the stores to x and y should be propagated in order to all other CPUs,
> > > even though those other CPUs might not access the lock s.  In terms of
> > > the memory model, this means expanding the cumul-fence relation.
> > > 
> > > Locks should also provide read-read (and read-write) ordering in a
> > > similar way.  Given:
> > > 
> > >   READ_ONCE(x);
> > >   spin_unlock(&s);
> > >   spin_lock(&s);
> > >   READ_ONCE(y);   // or WRITE_ONCE(y, 1);
> > > 
> > > the load of x should be executed before the load of (or store to) y.
> > > The LKMM already provides this ordering, but it provides it even in
> > > the case where the two accesses are separated by a release/acquire
> > > pair of fences rather than unlock/lock.  This would prevent
> > > architectures from using weakly ordered implementations of release and
> > > acquire, which seems like an unnecessary restriction.  The patch
> > > therefore removes the ordering requirement from the LKMM for that
> > > case.
> > 
> > IIUC, the same argument could be used to support the removal of the new
> > unlock-rf-lock-po (we already discussed riscv .aq/.rl, it doesn't seem
> > hard to imagine an arm64 LDAPR-exclusive, or the adoption of ctrl+isync
> > on powerpc).  Why are we effectively preventing their adoption?  Again,
> > I'd like to see more details about the underlying motivations...
> > 
> > 
> > > 
> > > All the architectures supported by the Linux kernel (including RISC-V)
> > > do provide this ordering for locks, albeit for varying reasons.
> > > Therefore this patch changes the model in accordance with the
> > > developers' wishes.
> > > 
> > > Signed-off-by: Alan Stern 
> > > 
> > > ---
> > > 
> > > v.2: Restrict the ordering to lock operations, not general release
> > > and acquire fences.
> > 
> > This is another controversial point, and one that makes me shiver ...
> > 
> > I have the impression that we're dismissing the suggestion "RMW-acquire
> > on par with LKR" in a bit of a rush.  So, this patch is implying that:
> > 
> > while (cmpxchg_acquire(&s, 0, 1) != 0)
> > cpu_relax();
> > 
> > is _not_ a valid implementation of spin_lock()! or, at least, it is not
> > when paired with an smp_store_release(). Will was anticipating inserting
> > arch hooks into the (generic) qspinlock code,  when we know that similar
> > patterns are spread all over in (q)rwlocks, mutexes, rwsem, ... (please
> > also notice that the informal documentation is currently treating these
> > synchronization mechanisms equally as far as "ordering" is concerned...).
> > 
> > This distinction between locking operations and "other acquires" appears
> > to me not only unmotivated but also extremely _fragile (difficult to use
> > /maintain) when considering the analysis of synchronization mechanisms
> > such as those mentioned above or their porting for new arch.
> 
> The main reason for this is because developers use spinlocks all of the
> time, including in drivers. It's less common to use explicit atomics and
> extremely rare to use explicit acquire/release operations. So let's make
> locks as easy to use as possible, by giving them the strongest semantics
> that we can whilst remaining a good fit for the instructions that are
> provided by the architecture.

Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-10 Thread Andrea Parri
On Tue, Jul 10, 2018 at 01:17:50PM -0400, Alan Stern wrote:
> On Tue, 10 Jul 2018, Daniel Lustig wrote:
> 
> > > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > > @@ -38,7 +38,7 @@ let strong-fence = mb | gp
> > >  (* Release Acquire *)
> > >  let acq-po = [Acquire] ; po ; [M]
> > >  let po-rel = [M] ; po ; [Release]
> > > -let rfi-rel-acq = [Release] ; rfi ; [Acquire]
> > > +let unlock-rf-lock-po = [UL] ; rf ; [LKR] ; po
> > 
> > It feels slightly weird that unlock-rf-lock-po is asymmetrical.  And in
> > fact, I think the current RISC-V solution we've been discussing (namely,
> > putting a fence.tso instead of a fence rw,w in front of the release)
> > may not even technically respect that particular sequence.  The
> > fence.tso solution really enforces "po; [UL]; rf; [LKR]", right?
> > 
> > Does something like "po; [UL]; rf; [LKR]; po" fit in with the rest
> > of the model?  If so, maybe that solves the asymmetry and also
> > legalizes the approach of putting fence.tso in front?
> 
> That would work just as well.  For this version of the patch it 
> doesn't make any difference, because nothing that comes po-after the 
> LKR is able to directly read the value stored by the UL.

Consider:

C v2-versus-v3

{}

P0(spinlock_t *s, int *x)
{
spin_lock(s);   /* A */
spin_unlock(s);
spin_lock(s);
WRITE_ONCE(*x, 1); /* B */
spin_unlock(s);
}

P1(spinlock_t *s, int *x)
{
int r0;
int r1;

r0 = READ_ONCE(*x); /* C */
smp_rmb();
r1 = spin_is_locked(s); /* D */
}

With v3, it's allowed that C reads from B and D reads from (the LKW of) A;
this is not allowed with v2 (unless I mis-applied/mis-read v2).

  Andrea


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-10 Thread Andrea Parri
On Tue, Jul 10, 2018 at 11:34:45AM -0400, Alan Stern wrote:
> On Tue, 10 Jul 2018, Andrea Parri wrote:
> 
> > > >   ACQUIRE operations include LOCK operations and both smp_load_acquire()
> > > >   and smp_cond_acquire() operations.  [BTW, the latter was replaced by
> > > >   smp_cond_load_acquire() in 1f03e8d2919270 ...]
> > > > 
> > > >   RELEASE operations include UNLOCK operations and smp_store_release()
> > > >   operations. [...]
> > > > 
> > > >   [...] after an ACQUIRE on a given variable, all memory accesses
> > > >   preceding any prior RELEASE on that same variable are guaranteed
> > > >   to be visible.
> > > 
> > > As far as I can see, these statements remain valid.
> > 
> > Interesting ;-)  What do these statements tell you ;-) when applied
> > to a: and b: below?
> > 
> >   a: WRITE_ONCE(x, 1); // "preceding any prior RELEASE..."
> >   smp_store_release(&s, 1);
> >   smp_load_acquire(&s);
> >   b: WRITE_ONCE(y, 1); // "after an ACQUIRE..."
> 
> The first statement tells me that b: follows an ACQUIRE.
> 
> The second tells me that a: precedes a RELEASE.
> 
> And the third tells me that any READ_ONCE(x) statements coming po-after 
> b: would see x = 1 or a later value of x.  (Of course, they would have 
> to see that anyway because of the cache coherency rules.)

Mmh, something like "visible from the same CPU as the ACQUIRE" probably
could have helped me reach the same conclusion.


> 
> More to the point, given:
> 
> P0()
> {
>   WRITE_ONCE(x, 1);
>   a: smp_store_release(&s, 1);
> }
> 
> P1()
> {
>   b: r1 = smp_load_acquire(&s);
>   r2 = READ_ONCE(x);
> }
> 
> the third statement tells me that if r1 = 1 (that is, if a: is prior to
> b:) then r2 must be 1.

Indeed; the "prior" is ambiguous, but yes.

  Andrea


> 
> Alan
> 


Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-10 Thread Andrea Parri
> >   ACQUIRE operations include LOCK operations and both smp_load_acquire()
> >   and smp_cond_acquire() operations.  [BTW, the latter was replaced by
> >   smp_cond_load_acquire() in 1f03e8d2919270 ...]
> > 
> >   RELEASE operations include UNLOCK operations and smp_store_release()
> >   operations. [...]
> > 
> >   [...] after an ACQUIRE on a given variable, all memory accesses
> >   preceding any prior RELEASE on that same variable are guaranteed
> >   to be visible.
> 
> As far as I can see, these statements remain valid.

Interesting ;-)  What do these statements tell you ;-) when applied
to a: and b: below?

  a: WRITE_ONCE(x, 1); // "preceding any prior RELEASE..."
  smp_store_release(&s, 1);
  smp_load_acquire(&s);
  b: WRITE_ONCE(y, 1); // "after an ACQUIRE..."
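
In litmus form, with an MP observer added (a sketch; if I read the models
right, the final state below is forbidden by the pre-patch LKMM via
rfi-rel-acq and becomes allowed once that ordering is removed for plain
release/acquire):

C MP+rel-rfi-acq

{}

P0(int *s, int *x, int *y)
{
	int r0;

	WRITE_ONCE(*x, 1);              /* a: */
	smp_store_release(s, 1);
	r0 = smp_load_acquire(s);
	WRITE_ONCE(*y, 1);              /* b: */
}

P1(int *x, int *y)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*y);
	smp_rmb();
	r2 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 1:r2=0)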

  Andrea


[PATCH] doc: Replace smp_cond_acquire() with smp_cond_load_acquire()

2018-07-10 Thread Andrea Parri
Amend commit 1f03e8d2919270 ("locking/barriers: Replace smp_cond_acquire()
with smp_cond_load_acquire()") by updating the documentation accordingly.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
Cc: Daniel Lustig 
Cc: Jonathan Corbet 
---
 Documentation/memory-barriers.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 0d8d7ef131e9a..987a4e6cc0cd8 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -471,8 +471,8 @@ And a couple of implicit varieties:
  operations after the ACQUIRE operation will appear to happen after the
  ACQUIRE operation with respect to the other components of the system.
  ACQUIRE operations include LOCK operations and both smp_load_acquire()
- and smp_cond_acquire() operations. The later builds the necessary ACQUIRE
- semantics from relying on a control dependency and smp_rmb().
+ and smp_cond_load_acquire() operations. The later builds the necessary
+ ACQUIRE semantics from relying on a control dependency and smp_rmb().
 
  Memory operations that occur before an ACQUIRE operation may appear to
  happen after it completes.
-- 
2.7.4



Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

2018-07-10 Thread Andrea Parri
On Mon, Jul 09, 2018 at 04:01:57PM -0400, Alan Stern wrote:
> More than one kernel developer has expressed the opinion that the LKMM
> should enforce ordering of writes by locking.  In other words, given

I'd like to step back on this point: I still don't have a strong opinion
on this, but all this debating made me curious about others' opinion ;-)
I'd like to see the above argument expanded: what's the rationale behind
that opinion? can we maybe add references to actual code relying on that
ordering? others that I've been missing?

I'd extend these same questions to the "ordering of reads" snippet below
(and discussed since so long...).


> the following code:
> 
>   WRITE_ONCE(x, 1);
>   spin_unlock(&s):
>   spin_lock(&s);
>   WRITE_ONCE(y, 1);
> 
> the stores to x and y should be propagated in order to all other CPUs,
> even though those other CPUs might not access the lock s.  In terms of
> the memory model, this means expanding the cumul-fence relation.
> 
> Locks should also provide read-read (and read-write) ordering in a
> similar way.  Given:
> 
>   READ_ONCE(x);
>   spin_unlock(&s);
>   spin_lock(&s);
>   READ_ONCE(y);   // or WRITE_ONCE(y, 1);
> 
> the load of x should be executed before the load of (or store to) y.
> The LKMM already provides this ordering, but it provides it even in
> the case where the two accesses are separated by a release/acquire
> pair of fences rather than unlock/lock.  This would prevent
> architectures from using weakly ordered implementations of release and
> acquire, which seems like an unnecessary restriction.  The patch
> therefore removes the ordering requirement from the LKMM for that
> case.

IIUC, the same argument could be used to support the removal of the new
unlock-rf-lock-po (we already discussed riscv .aq/.rl, it doesn't seem
hard to imagine an arm64 LDAPR-exclusive, or the adoption of ctrl+isync
on powerpc).  Why are we effectively preventing their adoption?  Again,
I'd like to see more details about the underlying motivations...


> 
> All the architectures supported by the Linux kernel (including RISC-V)
> do provide this ordering for locks, albeit for varying reasons.
> Therefore this patch changes the model in accordance with the
> developers' wishes.
> 
> Signed-off-by: Alan Stern 
> 
> ---
> 
> v.2: Restrict the ordering to lock operations, not general release
> and acquire fences.

This is another controversial point, and one that makes me shiver ...

I have the impression that we're dismissing the suggestion "RMW-acquire
on par with LKR" in a bit of a rush.  So, this patch is implying that:

while (cmpxchg_acquire(&s, 0, 1) != 0)
cpu_relax();

is _not_ a valid implementation of spin_lock()! or, at least, it is not
when paired with an smp_store_release(). Will was anticipating inserting
arch hooks into the (generic) qspinlock code,  when we know that similar
patterns are spread all over in (q)rwlocks, mutexes, rwsem, ... (please
also notice that the informal documentation is currently treating these
synchronization mechanisms equally as far as "ordering" is concerned...).

This distinction between locking operations and "other acquires" appears
to me not only unmotivated but also extremely _fragile (difficult to use
/maintain) when considering the analysis of synchronization mechanisms
such as those mentioned above or their porting for new arch.

Please see below for a couple of minor comments.


> 
> [as1871b]
> 
> 
>  tools/memory-model/Documentation/explanation.txt                           | 186 +++---
>  tools/memory-model/linux-kernel.cat                                        |   8 
>  tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus |   5 
>  3 files changed, 149 insertions(+), 50 deletions(-)
> 
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -38,7 +38,7 @@ let strong-fence = mb | gp
>  (* Release Acquire *)
>  let acq-po = [Acquire] ; po ; [M]
>  let po-rel = [M] ; po ; [Release]
> -let rfi-rel-acq = [Release] ; rfi ; [Acquire]
> +let unlock-rf-lock-po = [UL] ; rf ; [LKR] ; po
>  
>  (**)
>  (* Fundamental coherence ordering *)
> @@ -60,13 +60,13 @@ let dep = addr | data
>  let rwdep = (dep | ctrl) ; [W]
>  let overwrite = co | fr
>  let to-w = rwdep | (overwrite & int)
> -let to-r = addr | (dep ; rfi) | rfi-rel-acq
> +let to-r = addr | (dep ; rfi)
>  let fence = strong-fence | wmb | po-rel | rmb | acq-po
> -let ppo = to-r | to-w | fence
> +let ppo = to-r | to-w | fence | (unlock-rf-lock-po & int)
>  
>  (* Propagation: Ordering from release operations and strong fences. *)
>  let A-cumul(r) = rfe? ; r
> -let cumul-fence = A-cumul(strong-fence | po-rel) | wmb
> +let cumul-fence = A-cumul(strong-fence | po-rel) | wmb | 

Re: [PATCH V2 11/19] csky: Atomic operations

2018-07-07 Thread Andrea Parri
On Sat, Jul 07, 2018 at 04:08:47PM +0800, Guo Ren wrote:
> On Fri, Jul 06, 2018 at 02:17:16PM +0200, Peter Zijlstra wrote:

>   CPU0CPU1
> 
>   WRITE_ONCE(x, 1)WRITE_ONCE(y, 1)
>   r0 = xchg(&y, 2)r1 = xchg(&x, 2)
> 
> must not allow: r0==0 && r1==0
> So we must add a smp_mb between WRITE_ONCE() and xchg(), right?

The state (r0==0 && r1==0) _must_ not be allowed in the above snippet (so,
even without the additional smp_mb() between WRITE_ONCE() and xchg()).  In
informal terms, xchg() provides the smp_mb().

Compare implementations of xchg() and xchg_relaxed().  The following could
also be helpful (in addition to the references pointed out earlier):

  Documentation/atomic_t.txt
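
In litmus form (my sketch), the snippet reads as below; herd7 with
tools/memory-model/ should report the final state as never happening,
precisely because xchg() acts as if surrounded by smp_mb():

C SB+xchg

{}

P0(int *x, int *y)
{
	int r0;

	WRITE_ONCE(*x, 1);
	r0 = xchg(y, 2);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	r1 = xchg(x, 2);
}

exists (0:r0=0 /\ 1:r1=0)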

  Andrea


> 
>  Guo Ren
> 


Re: [PATCH V2 11/19] csky: Atomic operations

2018-07-07 Thread Andrea Parri
Hi Guo,

On Sat, Jul 07, 2018 at 03:42:10PM +0800, Guo Ren wrote:
> On Fri, Jul 06, 2018 at 01:56:14PM +0200, Peter Zijlstra wrote:
> > CPU0CPU1
> > 
> > r1 = READ_ONCE(x);  WRITE_ONCE(y, 1);
> > r2 = xchg(&y, 2);   smp_store_release(&x, 1);
> > 
> > must not allow: r1==1 && r2==0
> CPU1 smp_store_release could be finished before WRITE_ONCE, so r1=1 &&
> r2=0?

The emphasis is on the "must": your implementation __must__ prevent this
from happening (say, by inserting memory barriers in smp_store_release());
if your implementation allows the state (r1==1 && r2==0), then the
implementation is incorrect.

I'd suggest you have a look at the Linux-kernel memory consistency model
documentation and the associated tools, starting with:

  Documentation/memory-barriers.txt
  tools/memory-model/

(and please do not hesitate to ask questions about them, if something is
 unclear).
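
For the record, the snippet above in the form those tools accept (my
sketch); the LKMM forbids the final state, which is why a correct
implementation must do so too:

C MP+rel-xchg

{}

P0(int *x, int *y)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*x);
	r2 = xchg(y, 2);
}

P1(int *x, int *y)
{
	WRITE_ONCE(*y, 1);
	smp_store_release(x, 1);
}

exists (0:r1=1 /\ 0:r2=0)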

  Andrea


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
On Thu, Jul 05, 2018 at 08:38:36PM +0200, Andrea Parri wrote:
> > No, I'm definitely not pushing for anything stronger.  I'm still just
> > wondering if the name "RCsc" is right for what you described.  For
> > example, Andrea just said this in a parallel email:
> > 
> > > "RCsc" as ordering everything except for W -> R, without the [extra]
> > > barriers
> 
> And I already regret it: the point is, different communities/people have
> different things in mind when they use terms such as "RCsc" or "ordering",
> and different communities seem to be represented in the LKMM.
> 
> Really, I don't think that this is simply a matter of naming (personally,
> I'd be OK with "foo" or whatever you suggested below! ;-)). My suggestion
> would be: "get in there!!" ;-) Please let's refrain from using terms such
> as these (_overly_ overloaded) "RCsc" and "order" when talking about the
> MCM; let's rather talk, say, about "ppo", "cumul-fence" ...

... or bare litmus tests!

  Andrea


> 
>   Andrea
> 
> 
> > 
> > If it's "RCsc with exceptions", doesn't it make sense to find a
> > different name, rather than simply overloading the term "RCsc" with
> > a subtly different meaning, and hoping nobody gets confused?
> > 
> > I suppose on x86 and ARM you'd happen to get "true RCsc" anyway, just
> > due to the way things are currently mapped: LOCKed RMWs and "true RCsc"
> > instructions, respectively.  But on Power and RISC-V, it would really
> > be more "RCsc with a W->R exception", right?
> > 
> > In fact, the more I think about it, this doesn't seem to be RCsc at all.
> > It seems closer to "RCpc plus extra PC ordering between critical
> > sections".  No?
> > 
> > The synchronization accesses themselves aren't sequentially consistent
> > with respect to each other under the Power or RISC-V mappings, unless
> > there's a hwsync in there somewhere that I missed?  Or a rule
> > preventing stw from forwarding to lwarx?  Or some other higher-order
> > effect preventing it from being observed anyway?
> > 
> > So that's all I'm suggesting here.  If you all buy that, maybe "RCpccs"
> > for "RCpc with processor consistent critical section ordering"?
> > I don't have a strong opinion on the name itself; I just want to find
> > a name that's less ambiguous or overloaded.
> > 
> > Dan


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
> No, I'm definitely not pushing for anything stronger.  I'm still just
> wondering if the name "RCsc" is right for what you described.  For
> example, Andrea just said this in a parallel email:
> 
> > "RCsc" as ordering everything except for W -> R, without the [extra]
> > barriers

And I already regret it: the point is, different communities/people have
different things in mind when they use terms such as "RCsc" or "ordering",
and different communities seem to be represented in the LKMM.

Really, I don't think that this is simply a matter of naming (personally,
I'd be OK with "foo" or whatever you suggested below! ;-)). My suggestion
would be: "get in there!!" ;-) Please let's refrain from using terms such
as these (_overly_ overloaded) "RCsc" and "order" when talking about the
MCM; let's rather talk, say, about "ppo", "cumul-fence" ...

  Andrea


> 
> If it's "RCsc with exceptions", doesn't it make sense to find a
> different name, rather than simply overloading the term "RCsc" with
> a subtly different meaning, and hoping nobody gets confused?
> 
> I suppose on x86 and ARM you'd happen to get "true RCsc" anyway, just
> due to the way things are currently mapped: LOCKed RMWs and "true RCsc"
> instructions, respectively.  But on Power and RISC-V, it would really
> be more "RCsc with a W->R exception", right?
> 
> In fact, the more I think about it, this doesn't seem to be RCsc at all.
> It seems closer to "RCpc plus extra PC ordering between critical
> sections".  No?
> 
> The synchronization accesses themselves aren't sequentially consistent
> with respect to each other under the Power or RISC-V mappings, unless
> there's a hwsync in there somewhere that I missed?  Or a rule
> preventing stw from forwarding to lwarx?  Or some other higher-order
> effect preventing it from being observed anyway?
> 
> So that's all I'm suggesting here.  If you all buy that, maybe "RCpccs"
> for "RCpc with processor consistent critical section ordering"?
> I don't have a strong opinion on the name itself; I just want to find
> a name that's less ambiguous or overloaded.
> 
> Dan


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
On Thu, Jul 05, 2018 at 09:58:48AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 05, 2018 at 05:39:06PM +0200, Andrea Parri wrote:
> > > > At any rate, it looks like instead of strengthening the relation, I
> > > > should write a patch that removes it entirely.  I also will add new,
> > > > stronger relations for use with locking, essentially making spin_lock
> > > > and spin_unlock be RCsc.
> > > 
> > > Only in the presence of smp_mb__after_unlock_lock() or
> > > smp_mb__after_spinlock(), correct?  Or am I confused about RCsc?
> > 
> > There are at least two definitions of RCsc: one as documented in the header
> > comment for smp_mb__after_spinlock() or rather in the patch under review...,
> > one as processor architects used to intend it. ;-)
> 
> Searching isn't working for me all that well this morning, so could you
> please send me a pointer to that patch?

Sorry, I meant in _this patch_: "RCsc" as ordering everything except for
W -> R, without the barriers above (informally, the current LKMM misses
the W -> W order only).

  Andrea

> 
>   Thanx, Paul
> 


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
> > At any rate, it looks like instead of strengthening the relation, I
> > should write a patch that removes it entirely.  I also will add new,
> > stronger relations for use with locking, essentially making spin_lock
> > and spin_unlock be RCsc.
> 
> Only in the presence of smp_mb__after_unlock_lock() or
> smp_mb__after_spinlock(), correct?  Or am I confused about RCsc?

There are at least two definitions of RCsc: one as documented in the header
comment for smp_mb__after_spinlock() or rather in the patch under review...,
one as processor architects used to intend it. ;-)

  Andrea
>   Thanx, Paul
> 


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
> At any rate, it looks like instead of strengthening the relation, I
> should write a patch that removes it entirely.  I also will add new,
> stronger relations for use with locking, essentially making spin_lock
> and spin_unlock be RCsc.

Thank you.

Ah let me put this forward: please keep an eye on the (generic)

  queued_spin_lock()
  queued_spin_unlock()

(just to point out an example). Their implementation (in particular,
the fast path) suggests that if we stick to RCsc lock then we should
also stick to RCsc acquire loads from RMWs and release stores.
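
For reference, the fast paths in question, paraphrased from memory (the
real code lives in include/asm-generic/qspinlock.h; details simplified
and possibly out of date):

static __always_inline void queued_spin_lock(struct qspinlock *lock)
{
	u32 val;

	/* acq. load from RMW: uncontended 0 -> locked transition */
	val = atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL);
	if (likely(val == 0))
		return;
	queued_spin_lock_slowpath(lock, val);
}

static __always_inline void queued_spin_unlock(struct qspinlock *lock)
{
	/* rel. store: no RMW on the unlock side */
	smp_store_release(&lock->locked, 0);
}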

  Andrea


> 
> Alan
> 


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-07-05 Thread Andrea Parri
On Wed, Jul 04, 2018 at 01:11:04PM +0100, Will Deacon wrote:
> Hi Alan,
> 
> On Tue, Jul 03, 2018 at 01:28:17PM -0400, Alan Stern wrote:
> > On Mon, 25 Jun 2018, Andrea Parri wrote:
> > 
> > > On Fri, Jun 22, 2018 at 07:30:08PM +0100, Will Deacon wrote:
> > > > > > I think the second example would preclude us using LDAPR for load-acquire,
> > > 
> > > > I don't think it's a moot point. We want new architectures to implement
> > > > acquire/release efficiently, and it's not unlikely that they will have
> > > > acquire loads that are similar in semantics to LDAPR. This patch prevents
> > > > them from doing so,
> > > 
> > > By this same argument, you should not be a "big fan" of rfi-rel-acq in ppo ;)
> > > consider, e.g., the two litmus tests below: what am I missing?
> > 
> > This is an excellent point, which seems to have gotten lost in the 
> > shuffle.  I'd like to see your comments.
> 
> Yeah, sorry. Loads going on at the moment. You could ask herd instead of me
> though ;)
> 
> > In essence, if you're using release-acquire instructions that only
> > provide RCpc consistency, does store-release followed by load-acquire
> > of the same address provide read-read ordering?  In theory it doesn't
> > have to, because if the value from the store-release is forwarded to
> > the load-acquire then:
> > 
> > LOAD A
> > STORE-RELEASE X, v
> > LOAD-ACQUIRE X
> > LOAD B
> > 
> > could be executed by the CPU in the order:
> > 
> > LOAD-ACQUIRE X
> > LOAD B
> > LOAD A
> > STORE-RELEASE X, v
> > 
> > thereby accessing A and B out of program order without violating the
> > requirements on the release or the acquire.
> > 
> > Of course PPC doesn't allow this, but should we rule it out entirely?
> 
> This would be allowed if LOAD-ACQUIRE was implemented using LDAPR on Arm.
> I don't think we should be ruling out architectures using RCpc
> acquire/release primitives, because doing so just feels like an artifact of
> most architectures building these out of fences today.
> 
> It's funny really, because from an Arm-perspective I don't plan to stray
> outside of RCsc, but I feel like other weak architectures aren't being
> well represented here. If we just care about x86, Arm and Power (and assume
> that Power doesn't plan to implement RCpc acquire/release instructions)
> then we're good to tighten things up. But I fear that RISC-V should probably
> be more engaged (adding Daniel) and who knows about MIPS or these other
> random architectures popping up on linux-arch.
> 
> > > C MP+fencewmbonceonce+pooncerelease-rfireleaseacquire-poacquireonce
> > > 
> > > {}
> > > 
> > > P0(int *x, int *y)
> > > {
> > >   WRITE_ONCE(*x, 1);
> > >   smp_wmb();
> > >   WRITE_ONCE(*y, 1);
> > > }
> > > 
> > > P1(int *x, int *y, int *z)
> > > {
> > >   r0 = READ_ONCE(*y);
> > >   smp_store_release(z, 1);
> > >   r1 = smp_load_acquire(z);
> > >   r2 = READ_ONCE(*x);
> > > }
> > > 
> > > exists (1:r0=1 /\ 1:r1=1 /\ 1:r2=0)
> > > 
> > > 
> > > AArch64 MP+dmb.st+popl-rfilq-poqp
> > > "DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre"
> > > Generator=diyone7 (version 7.49+02(dev))
> > > Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
> > > Com=Rf Fr
> > > Orig=DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre
> > > {
> > > 0:X1=x; 0:X3=y;
> > > 1:X1=y; 1:X3=z; 1:X6=x;
> > > }
> > >  P0  | P1;
> > >  MOV W0,#1   | LDR W0,[X1]   ;
> > >  STR W0,[X1] | MOV W2,#1 ;
> > >  DMB ST  | STLR W2,[X3]  ;
> > >  MOV W2,#1   | LDAPR W4,[X3] ;
> > >  STR W2,[X3] | LDR W5,[X6]   ;
> > > exists
> > > (1:X0=1 /\ 1:X4=1 /\ 1:X5=0)
> 
> (you can also run this yourself, since 'Q' is supported in the .cat file
> I contributed to herdtools7)
> 
> Test MP+dmb.sy+popl-rfilq-poqp Allowed
> States 4
> 1:X0=0; 1:X4=1; 1:X5=0;
> 1:X0=0; 1:X4=1; 1:X5=1;
> 1:X0=1; 1:X4=1; 1:X5=0;
> 1:X0=1; 1:X4=1; 1:X5=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (1:X0=1 /\ 1:X4=1 /\ 1:X5=0)
> Observation MP+dmb.sy+popl-rfilq-poqp Sometimes 1 3
> Time MP+dmb.sy+popl-rfilq-poqp 0.01
> Hash=61858b7b59a6310d869f99cd05718f96
> 
> > There's also re

Re: [PATCH 0/2] tools/memory-model: remove ACCESS_ONCE()

2018-07-03 Thread Andrea Parri
> commit 33a58ee5eadadfb1f4850eabd4fac332984881d5
> Author: Paul E. McKenney 
> Date:   Tue Jul 3 08:48:09 2018 -0700
> 
> tools/memory-model: Add informal LKMM documentation to MAINTAINERS
> 
> The Linux-kernel memory model has been informal, with a number of
> text files documenting it.  It would be good to make sure that these
> informal descriptions are kept up to date and/or pruned appropriately.
> This commit therefore brings more of those text files into the LKMM
> MAINTAINERS file entry.
> 
> Signed-off-by: Paul E. McKenney 
> Cc: Alan Stern 
> Cc: Andrea Parri 
> Cc: Will Deacon 
> Cc: Peter Zijlstra 
> Cc: Boqun Feng 
> Cc: Nicholas Piggin 
> Cc: David Howells 
> Cc: Jade Alglave 
> Cc: Luc Maranget 
> Cc: Akira Yokosawa 
> Cc: Daniel Lustig 
> Cc: "David S. Miller" 

With the disclaimer that I'm not (yet) familiar with reST,

Acked-by: Andrea Parri 

Adding the linux-arch ML, as you suggested, would also make sense to me.

Thanks,
  Andrea


> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f2903b818671..2ba947fc9a2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8321,6 +8321,10 @@ L: linux-kernel@vger.kernel.org
>  S:   Supported
>  T:   git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>  F:   tools/memory-model/
> +F:   Documentation/atomic_bitops.txt
> +F:   Documentation/atomic_t.txt
> +F:   Documentation/core-api/atomic_ops.rst
> +F:   Documentation/core-api/refcount-vs-atomic.rst
>  F:   Documentation/memory-barriers.txt
>  
>  LINUX SECURITY MODULE (LSM) FRAMEWORK
> 


Re: [PATCH v3 2/3] locking: Clarify requirements for smp_mb__after_spinlock()

2018-07-03 Thread Andrea Parri
On Tue, Jul 03, 2018 at 08:39:10AM -0700, Paul E. McKenney wrote:

[...]

> > + * smp_mb__after_spinlock() provides the equivalent of a full memory barrier
> > + * between program-order earlier lock acquisitions and program-order later
> 
> Not just the earlier lock acquisition, but also all program-order earlier
> memory accesses, correct?

I understand: "but also all memory accesses program-order before that lock
acquisition(s) ...".  Yes, but:

  - I considered this as implied by the above (L ->mb M2 and M1 ->po L implies
M1 ->mb M2, where M1, M2 are memory accesses and L is a lock acquisition);

  - my prose abilities are limited ;-), and I was/am unable to come up with an
(to me) acceptable or readable enough way to make it explicit; some ideas?
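
(FWIW, the implication in the first item, spelled out on a snippet; just a
sketch, not from the patch:

	WRITE_ONCE(X, 1);		/* M1 */
	spin_lock(S);			/* L:  M1 ->po L */
	smp_mb__after_spinlock();	/* L ->mb M2 */
	r0 = READ_ONCE(Y);		/* M2: hence M1 ->mb M2 */
)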


> > + *	WRITE_ONCE(X, 1);		WRITE_ONCE(Y, 1);
> > + *	spin_lock(S);			smp_mb();
> > + *	smp_mb__after_spinlock();	r1 = READ_ONCE(X);
> > + *	r0 = READ_ONCE(Y);
> > + *	spin_unlock(S);
> 
> Should we say that this is an instance of the SB pattern?  (Am OK either
> way, just asking the question.)

I don't think we *should* ;-),  but I'm also OK either way.

  Andrea


Re: [PATCH] refcount: always allow checked forms

2018-07-03 Thread Andrea Parri
Hi Mark,

a typo below:


>  /**
> - * refcount_inc - increment a refcount
> + * refcount_inc_checked - increment a refcount
>   * @r: the refcount to increment
>   *
>   * Similar to atomic_inc(), but will saturate at UINT_MAX and WARN.
> @@ -148,14 +146,14 @@ EXPORT_SYMBOL(refcount_inc_not_zero);
>   * Will WARN if the refcount is 0, as this represents a possible 
> use-after-free
>   * condition.
>   */
> -void refcount_inc(refcount_t *r)
> +void refcount_inc_chcked(refcount_t *r)

s/chcked/checked

  Andrea


Re: [PATCH 0/2] tools/memory-model: remove ACCESS_ONCE()

2018-07-03 Thread Andrea Parri
> >   1) Merge the file touched by that patch into (the recently created):
> >   
> > Documentation/atomic_t.txt
> > 
> >  (FWIW, queued in my TODO list).
> 
> Some consolidation of documentation would be good.  ;-)
> 
> Thoughts from others?
> 
> >   2) Add the entry:
> > 
> > F: Documentation/atomic_t.txt
> > 
> >  to the "ATOMIC INFRASTRUCTURE" subsystem in the MAINTAINERS file so
> >  that developers can easily find (the intended?) reviewers for their
> >  patch. (Of course, this will need ACK from the ATOMIC people).
> 
> If the merging will take awhile, it might also be good to put
> Documentation/core-api/atomic_ops.rst somewhere as well.

Indeed.  And let's not forget the "orphaned":

  Documentation/atomic_bitops.txt
  Documentation/core-api/refcount-vs-atomic.rst

;-)

  Andrea


[PATCH v2 2/3] locking: Clarify requirements for smp_mb__after_spinlock()

2018-07-02 Thread Andrea Parri
There are 11 interpretations of the requirements described in the header
comment for smp_mb__after_spinlock(): one for each LKMM maintainer, and
one currently encoded in the Cat file. Stick to the latter (until a more
satisfactory solution is available).

This also reworks some snippets related to the barrier to illustrate the
requirements and to link them to the idioms which are relied upon at its
call sites.

Suggested-by: Boqun Feng 
Signed-off-by: Andrea Parri 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Will Deacon 
Cc: "Paul E. McKenney" 
---
Changes since v1:

  - reworked the snippets (Peter Zijlstra)
  - style fixes (Alan Stern and Matthew Wilcox)
  - added Boqun's Suggested-by: tag

 include/linux/spinlock.h | 51 
 kernel/sched/core.c  | 41 +++---
 2 files changed, 55 insertions(+), 37 deletions(-)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 1e8a464358384..0b46efca659f9 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -114,29 +114,46 @@ do {								\
 #endif /*arch_spin_is_contended*/
 
 /*
- * This barrier must provide two things:
+ * smp_mb__after_spinlock() provides the equivalent of a full memory barrier
+ * between program-order earlier lock acquisitions and program-order later
+ * memory accesses.
  *
- *   - it must guarantee a STORE before the spin_lock() is ordered against a
- * LOAD after it, see the comments at its two usage sites.
+ * This guarantees that the following two properties hold:
  *
- *   - it must ensure the critical section is RCsc.
+ *   1) Given the snippet:
  *
- * The latter is important for cases where we observe values written by other
- * CPUs in spin-loops, without barriers, while being subject to scheduling.
+ *   { X = 0;  Y = 0; }
  *
- * CPU0				CPU1			CPU2
+ *	CPU0				CPU1
  *
- *				for (;;) {
- *				  if (READ_ONCE(X))
- *				    break;
- *				}
- * X=1
- *				<sched-out>
- *						<sched-in>
- *						r = X;
+ *	WRITE_ONCE(X, 1);		WRITE_ONCE(Y, 1);
+ *	spin_lock(S);			smp_mb();
+ *	smp_mb__after_spinlock();	r1 = READ_ONCE(X);
+ *	r0 = READ_ONCE(Y);
+ *	spin_unlock(S);
  *
- * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
- * we get migrated and CPU2 sees X==0.
+ *  it is forbidden that CPU0 does not observe CPU1's store to Y (r0 = 0)
+ *  and CPU1 does not observe CPU0's store to X (r1 = 0); see the comments
+ *  preceding the call to smp_mb__after_spinlock() in __schedule() and in
+ *  try_to_wake_up().
+ *
+ *   2) Given the snippet:
+ *
+ *  { X = 0;  Y = 0; }
+ *
+ *	CPU0			CPU1				CPU2
+ *
+ *	spin_lock(S);		spin_lock(S);			r1 = READ_ONCE(Y);
+ *	WRITE_ONCE(X, 1);	smp_mb__after_spinlock();	smp_rmb();
+ *	spin_unlock(S);		r0 = READ_ONCE(X);		r2 = READ_ONCE(X);
+ *				WRITE_ONCE(Y, 1);
+ *				spin_unlock(S);
+ *
+ *  it is forbidden that CPU0's critical section executes before CPU1's
+ *  critical section (r0 = 1), CPU2 observes CPU1's store to Y (r1 = 1)
+ *  and CPU2 does not observe CPU0's store to X (r2 = 0); see the comments
+ *  preceding the calls to smp_rmb() in try_to_wake_up() for similar
+ *  snippets but "projected" onto two CPUs.
  *
  * Since most load-store architectures implement ACQUIRE with an smp_mb() after
  * the LL/SC loop, they need no further barriers. Similarly all our TSO
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index da8f12119a127..ec9ef0aec71ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1999,21 +1999,20 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 * be possible to, falsely, observe p->on_rq == 0 and get stuck
 * in smp_cond_load_acquire() below.
 *
-* sched_ttwu_pending()			try_to_wake_up()
-*   [S] p->on_rq = 1;			[L] P->state
-*   UNLOCK rq->lock  ----.
-*			   \
-*			    +---   RMB
-* schedule()		   /
-*   LOCK rq->lock    ----'
-*   UNLOCK rq->lock
+* sched_ttwu_pending()			try_to_wake_up()
+*   STORE p->on_rq = 1			LOAD p->state
+*   UNLOCK rq->lock
+*
+* __schedule() (switch to task 'p')
+*   LOCK rq->lock			smp_rmb();
+*   smp_

Re: [PATCH 0/2] tools/memory-model: remove ACCESS_ONCE()

2018-06-28 Thread Andrea Parri
> 1bc179880fba docs: atomic_ops: Describe atomic_set as a write operation
> 
>   The above patches need at least one additional Acked-by
>   or Reviewed-by.  If any of you gets a chance, please do
>   look them over.

Glad this came out. ;-)

No objection to the patch: feel free to add my Reviewed-by: tag.

(BTW, atomic_set() would be better mapped to WRITE_ONCE()... in fact, to
 be fair, some archs do it the __asm__ __volatile__() way).
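
(Say, something like the generic definition below; a sketch, modulo the
archs doing it in assembly:

	static __always_inline void atomic_set(atomic_t *v, int i)
	{
		WRITE_ONCE(v->counter, i);
	}
)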

I do however have some suggestions concerning "the process":  searching
LKML for the patch and the related discussion, I could only find:

  [PATCH] docs: atomic_ops: atomic_set is a write (not read) operation

and I realize that none of the people Cc:-ed in this thread, except you,
were Cc:-ed in that discussion (in compliance with get_maintainer.pl).

My suggestions:

  1) Merge the file touched by that patch into (the recently created):
  
Documentation/atomic_t.txt

 (FWIW, queued in my TODO list).

  2) Add the entry:

F: Documentation/atomic_t.txt

 to the "ATOMIC INFRASTRUCTURE" subsystem in the MAINTAINERS file so
 that developers can easily find (the intended?) reviewers for their
 patch. (Of course, this will need ACK from the ATOMIC people).

  Andrea


Re: [PATCH 0/2] tools/memory-model: remove ACCESS_ONCE()

2018-06-28 Thread Andrea Parri
On Thu, Jun 28, 2018 at 01:33:45PM +0100, Mark Rutland wrote:
> Since commit:
> 
>   b899a850431e2dd0 ("compiler.h: Remove ACCESS_ONCE()")
> 
> ... there has been no definition of ACCESS_ONCE() in the kernel tree,
> and it has been necessary to use READ_ONCE() or WRITE_ONCE() instead.
> 
> However, since then the kernel memory model was added to the Linux tree,
> sporting new instances of ACCESS_ONCE() in examples and in the memory
> model itself.
> 
> These patches remove the new instances of ACCESS_ONCE() for consistency
> with the contemporary codebase.
> 
> Thanks,
> Mark.
> 
> Mark Rutland (2):
>   tools/memory-model: remove ACCESS_ONCE() from recipes
>   tools/memory-model: remove ACCESS_ONCE() from model

For the series:

Acked-by: Andrea Parri 

Cheers,
  Andrea


> 
>  tools/memory-model/Documentation/recipes.txt | 4 ++--
>  tools/memory-model/linux-kernel.bell | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> -- 
> 2.11.0
> 


Re: [PATCHv2 03/11] atomics: simplify cmpxchg() instrumentation

2018-06-25 Thread Andrea Parri
On Mon, Jun 25, 2018 at 11:59:44AM +0100, Mark Rutland wrote:
> Currently we define some fairly verbose wrappers for the cmpxchg()
> family so that we can pass a pointer and size into kasan_check_write().
> 
> The wrappers duplicate the size-switching logic necessary in arch code,
> and only work for scalar types. On some architectures, (cmp)xchg are
> used on non-scalar types, and thus the instrumented wrappers need to be
> able to handle this.
> 
> We could take the type-punning logic from {READ,WRITE}_ONCE(), but this
> makes the wrappers even more verbose, and requires several local
> variables in the macros.
> 
> Instead, let's simplify the wrappers into simple macros which:
> 
>  * snapshot the pointer into a single local variable, called __ai_ptr to
>avoid conflicts with variables in the scope of the caller.
> 
>  * call kasan_check_read() on __ai_ptr.

Maybe I'm misreading the diff: aren't you calling kasan_check_write()?
(not sure if it makes a difference in this case/for KTSan, but CMPXCHG
does not necessarily perform a write...)
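
(To illustrate the last remark; a sketch, not from the patch: a failed
CMPXCHG only loads from the location,

	int x = 0;
	int old = cmpxchg(&x, 1, 2);	/* x != 1: the compare fails,  */
					/* old == 0 and x is left at 0; */
					/* no store is performed        */
)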

  Andrea


> 
>  * invoke the arch_ function, passing the original arguments, bar
>__ai_ptr being substituted for ptr.
> 
> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland 
> Cc: Boqun Feng 
> Cc: Dmitry Vyukov 
> Cc: Peter Zijlstra 
> Cc: Will Deacon 
> ---
>  include/asm-generic/atomic-instrumented.h | 100 
> +-
>  1 file changed, 15 insertions(+), 85 deletions(-)
> 
> diff --git a/include/asm-generic/atomic-instrumented.h 
> b/include/asm-generic/atomic-instrumented.h
> index 3c64e95d5ed0..c7c3e4cdd942 100644
> --- a/include/asm-generic/atomic-instrumented.h
> +++ b/include/asm-generic/atomic-instrumented.h
> @@ -408,109 +408,39 @@ static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v)
>  }
>  #endif
>  
> -static __always_inline unsigned long
> -cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size)
> -{
> - kasan_check_write(ptr, size);
> - switch (size) {
> - case 1:
> - return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
> - case 2:
> - return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
> - case 4:
> - return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
> - case 8:
> - BUILD_BUG_ON(sizeof(unsigned long) != 8);
> - return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
> - }
> - BUILD_BUG();
> - return 0;
> -}
> -
>  #define cmpxchg(ptr, old, new)					\
>  ({   \
> - ((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old),  \
> - (unsigned long)(new), sizeof(*(ptr))));	\
> + typeof(ptr) __ai_ptr = (ptr);   \
> + kasan_check_write(__ai_ptr, sizeof(*__ai_ptr)); \
> + arch_cmpxchg(__ai_ptr, (old), (new));   \
>  })
>  
> -static __always_inline unsigned long
> -sync_cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new,
> -   int size)
> -{
> - kasan_check_write(ptr, size);
> - switch (size) {
> - case 1:
> - return arch_sync_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
> - case 2:
> - return arch_sync_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
> - case 4:
> - return arch_sync_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
> - case 8:
> - BUILD_BUG_ON(sizeof(unsigned long) != 8);
> - return arch_sync_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
> - }
> - BUILD_BUG();
> - return 0;
> -}
> -
>  #define sync_cmpxchg(ptr, old, new)  \
>  ({   \
> - ((__typeof__(*(ptr)))sync_cmpxchg_size((ptr),   \
> - (unsigned long)(old), (unsigned long)(new), \
> - sizeof(*(ptr))));	\
> + typeof(ptr) __ai_ptr = (ptr);   \
> + kasan_check_write(__ai_ptr, sizeof(*__ai_ptr)); \
> + arch_sync_cmpxchg(__ai_ptr, (old), (new));  \
>  })
>  
> -static __always_inline unsigned long
> -cmpxchg_local_size(volatile void *ptr, unsigned long old, unsigned long new,
> -int size)
> -{
> - kasan_check_write(ptr, size);
> - switch (size) {
> - case 1:
> - return arch_cmpxchg_local((u8 *)ptr, (u8)old, (u8)new);
> - case 2:
> - return arch_cmpxchg_local((u16 *)ptr, (u16)old, (u16)new);
> - case 4:
> - return arch_cmpxchg_local((u32 *)ptr, (u32)old, (u32)new);
> - case 8:
> - BUILD_BUG_ON(sizeof(unsigned long) != 8);
> - return arch_cmpxchg_local((u64 *)ptr, (u64)old, (u64)new);
> - }
>

Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-06-25 Thread Andrea Parri
On Mon, Jun 25, 2018 at 09:32:29AM +0200, Peter Zijlstra wrote:
> 
> I have yet to digest the rest of the discussion, however:
> 
> On Fri, Jun 22, 2018 at 02:09:04PM -0400, Alan Stern wrote:
> > The LKMM uses the same CAT code for acquire/release and lock/unlock.
> > (In essence, it considers a lock to be an acquire and an unlock to be a
> > release; everything else follows from that.)  Treating one differently
> > from the other in these tests would require some significant changes.
> > It wouldn't be easy.
> 
> That is problematic, acquire+release are very much simpler operations
> than lock+unlock.
> 
> At the very least, lock includes a control-dependency, where acquire
> does not.

I don't see how this is relevant here; roughly, "if something is guaranteed
by a control-dependency, that is also guaranteed by an acquire".  Right? ;)
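
(That is, roughly: a control dependency orders a load against dependent
stores only,

	r0 = READ_ONCE(*x);
	if (r0)
		WRITE_ONCE(*y, 1);	/* ordered after the load of *x */

while an acquire load orders against _all_ po-later accesses, that store
included:

	r0 = smp_load_acquire(x);
	WRITE_ONCE(*y, 1);		/* ordered after the load of *x */
)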

  Andrea


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-06-25 Thread Andrea Parri
On Fri, Jun 22, 2018 at 07:30:08PM +0100, Will Deacon wrote:
> > > I think the second example would preclude us using LDAPR for load-acquire,

> I don't think it's a moot point. We want new architectures to implement
> acquire/release efficiently, and it's not unlikely that they will have
> acquire loads that are similar in semantics to LDAPR. This patch prevents
> them from doing so,

By this same argument, you should not be a "big fan" of rfi-rel-acq in ppo ;)
Consider, e.g., the two litmus tests below: what am I missing?

  Andrea


C MP+fencewmbonceonce+pooncerelease-rfireleaseacquire-poacquireonce

{}

P0(int *x, int *y)
{
WRITE_ONCE(*x, 1);
smp_wmb();
WRITE_ONCE(*y, 1);
}

P1(int *x, int *y, int *z)
{
r0 = READ_ONCE(*y);
smp_store_release(z, 1);
r1 = smp_load_acquire(z);
r2 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=1 /\ 1:r2=0)


AArch64 MP+dmb.st+popl-rfilq-poqp
"DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre"
Generator=diyone7 (version 7.49+02(dev))
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=DMB.STdWW Rfe PodRWPL RfiLQ PodRRQP Fre
{
0:X1=x; 0:X3=y;
1:X1=y; 1:X3=z; 1:X6=x;
}
 P0  | P1;
 MOV W0,#1   | LDR W0,[X1]   ;
 STR W0,[X1] | MOV W2,#1 ;
 DMB ST  | STLR W2,[X3]  ;
 MOV W2,#1   | LDAPR W4,[X3] ;
 STR W2,[X3] | LDR W5,[X6]   ;
exists
(1:X0=1 /\ 1:X4=1 /\ 1:X5=0)


Re: [PATCH] MAINTAINERS: Add Daniel Lustig as a LKMM reviewer

2018-06-22 Thread Andrea Parri
> > Thanks.  Unless anyone has any opposition I'll submit the fixed
> > patch as part of my next pull request.
> 
> Works for me, especially if this means that Daniel is RISC-V's official
> representative.  ;-)

I'd rather the "fixed patch" go through the LKMM's tree.  If nothing else,
we tend to use get_maintainer.pl on your (TBD ;/) development
branch...


> 
> Acked-by: Paul E. McKenney 
> 
> > commit 9d01337e4724be4d34bfe848a7c64d14bfdb89ea
> > gpg: Signature made Fri 22 Jun 2018 03:35:24 PM PDT
> > gpg:using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
> > gpg:issuer "pal...@dabbelt.com"
> > gpg: Good signature from "Palmer Dabbelt " [ultimate]
> > gpg: aka "Palmer Dabbelt " [ultimate]
> > Author: Palmer Dabbelt 
> > Date:   Fri Jun 22 14:04:42 2018 -0700
> > 
> >MAINTAINERS: Add Daniel Lustig as a LKMM reviewer

Nit: an LKMM


> > 
> >Dan runs the RISC-V memory model working group.  I've been forwarding
> >him LKMM emails that end up in my inbox, but I'm far from an expert in
> >this stuff.  He requested to be added as a reviewer, which seem sane to

Nit: which seems


> >me as it'll take a human out of the loop.
> > 
> >CC: Daniel Lustig 
> >Acked-by: Daniel Lustig 
> >Signed-off-by: Palmer Dabbelt 

Glad to read this!  Please feel free to add:

Acked-by: Andrea Parri 

  Andrea


> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 9d5eeff51b5f..ac8ed55dbe9b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -8316,6 +8316,7 @@ M:Jade Alglave 
> > M:  Luc Maranget 
> > M:  "Paul E. McKenney" 
> > R:  Akira Yokosawa 
> > +R: Daniel Lustig 
> > L:  linux-kernel@vger.kernel.org
> > S:  Supported
> > T:  git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-06-22 Thread Andrea Parri
> > > I also just realised that this prevents Power from using ctrl+isync to
> > > implement acquire, should they wish to do so.
> > 
> > They in fact do so on chips lacking LWSYNC, see how PPC_ACQUIRE_BARRIER
> > (as used by atomic_*_acquire) turns into ISYNC (note however that they
> > do not use PPC_ACQUIRE_BARRIER for smp_load_acquire -- because there's
> > no CTRL there).
> 
> Right, so the example in the commit message is broken on PPC then. I think
> it's also broken on RISC-V, despite the claim.

I agree for RISC-V (and I missed it in my earlier review): the 2nd
snippet from the commit message would map to something like

   fence rw, w
   STORE #1,[x]
   LOAD  [x]
   fence r,rw
   STORE #1,[y]

and there would be no guarantee that the stores to x and y will be
propagated in program order to another CPU, AFAICT.  Thank you for
pointing this out.
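
(For reference, the snippet in question, quoting from the patch below:

	smp_store_release(&x, 1);
	r1 = smp_load_acquire(&x);	// r1 = 1
	WRITE_ONCE(y, 1);
)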

  Andrea


Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

2018-06-22 Thread Andrea Parri
On Thu, Jun 21, 2018 at 01:27:12PM -0400, Alan Stern wrote:
> More than one kernel developer has expressed the opinion that the LKMM
> should enforce ordering of writes by release-acquire chains and by
> locking.  In other words, given the following code:
> 
>   WRITE_ONCE(x, 1);
>   spin_unlock(&s);
>   spin_lock(&s);
>   WRITE_ONCE(y, 1);
> 
> or the following:
> 
>   smp_store_release(&x, 1);
>   r1 = smp_load_acquire(&x);  // r1 = 1
>   WRITE_ONCE(y, 1);
> 
> the stores to x and y should be propagated in order to all other CPUs,
> even though those other CPUs might not access the lock s or be part of
> the release-acquire chain.  In terms of the memory model, this means
> that rel-rf-acq-po should be part of the cumul-fence relation.
> 
> All the architectures supported by the Linux kernel (including RISC-V)
> do behave this way, albeit for varying reasons.  Therefore this patch
> changes the model in accordance with the developers' wishes.
> 
> Signed-off-by: Alan Stern 

This patch changes the "Result" for ISA2+pooncelock+pooncelock+pombonce,
so it should update the corresponding comment/README.

Reviewed-and-Tested-by: Andrea Parri 

  Andrea


> 
> ---
> 
> 
> [as1871]
> 
> 
>  tools/memory-model/Documentation/explanation.txt |   81 
> +++
>  tools/memory-model/linux-kernel.cat  |2 
>  2 files changed, 82 insertions(+), 1 deletion(-)
> 
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -66,7 +66,7 @@ let ppo = to-r | to-w | fence
>  
>  (* Propagation: Ordering from release operations and strong fences. *)
>  let A-cumul(r) = rfe? ; r
> -let cumul-fence = A-cumul(strong-fence | po-rel) | wmb
> +let cumul-fence = A-cumul(strong-fence | po-rel) | wmb | rel-rf-acq-po
>  let prop = (overwrite & ext)? ; cumul-fence* ; rfe?
>  
>  (*
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -1897,3 +1897,84 @@ non-deadlocking executions.  For example
>  Is it possible to end up with r0 = 36 at the end?  The LKMM will tell
>  you it is not, but the model won't mention that this is because P1
>  will self-deadlock in the executions where it stores 36 in y.
> +
> +In the LKMM, locks and release-acquire chains cause stores to
> +propagate in order.  For example:
> +
> + int x, y, z;
> +
> + P0()
> + {
> + WRITE_ONCE(x, 1);
> + smp_store_release(&y, 1);
> + }
> +
> + P1()
> + {
> + int r1;
> +
> + r1 = smp_load_acquire(&y);
> + WRITE_ONCE(z, 1);
> + }
> +
> + P2()
> + {
> + int r2, r3, r4;
> +
> + r2 = READ_ONCE(z);
> + smp_rmb();
> + r3 = READ_ONCE(x);
> + r4 = READ_ONCE(y);
> + }
> +
> +If r1 = 1 and r2 = 1 at the end, then both r3 and r4 must also be 1.
> +In other words, the smp_store_release() read by the smp_load_acquire()
> +together act as a sort of inter-processor fence, forcing the stores to
> +x and y to propagate to P2 before the store to z does, regardless of
> +the fact that P2 doesn't execute any release or acquire instructions.
> +This conclusion would hold even if P0 and P1 were on the same CPU, so
> +long as r1 = 1.
> +
> +We have mentioned that the LKMM treats locks as acquires and unlocks
> +as releases.  Therefore it should not be surprising that something
> +analogous to this ordering also holds for locks:
> +
> + int x, y;
> + spinlock_t s;
> +
> + P0()
> + {
> + spin_lock(&s);
> + WRITE_ONCE(x, 1);
> + spin_unlock(&s);
> + }
> +
> + P1()
> + {
> + int r1;
> +
> + spin_lock(&s);
> + r1 = READ_ONCE(x);
> + WRITE_ONCE(y, 1);
> + spin_unlock(&s);
> + }
> +
> + P2()
> + {
> + int r2, r3;
> +
> + r2 = READ_ONCE(y);
> + smp_rmb();
> + r3 = READ_ONCE(x);
> + }
> +
> +If r1 = 1 at the end (implying that P1's critical section executes
> +after P0's) and r2 = 1, then r3 must be 1; the ordering of the
> +critical sections forces the store to x to propagate to P2 before the
> +store to y does.
> +
> +In both versions of this scenario, the store-propagation ordering is
> +not required by the operational model.  However, it does happen on all
> +the architectures supporting the Linux kernel, and kernel developers
> +seem to expect it; they have requested that this behavior be included
> +in the LKMM.
> 


Re: [PATCH 1/2] tools/memory-model: Change rel-rfi-acq ordering to (rel-rf-acq-po & int)

2018-06-22 Thread Andrea Parri
On Thu, Jun 21, 2018 at 01:26:49PM -0400, Alan Stern wrote:
> This patch changes the LKMM rule which says that an acquire which
> reads from an earlier release must be executed after that release (in
> other words, the release cannot be forwarded to the acquire).  This is
> not true on PowerPC, for example.
> 
> What is true instead is that any instruction following the acquire
> must be executed after the release.  On some architectures this is
> because a write-release cannot be forwarded to a read-acquire; on
> others (including PowerPC) it is because the implementation of
> smp_load_acquire() places a memory barrier immediately after the
> load.
> 
> This change to the model does not cause any change to the model's
> predictions.  This is because any link starting from a load must be an
> instance of either po or fr.  In the po case, the new rule will still
> provide ordering.  In the fr case, we also have ordering because there
> must be a co link to the same destination starting from the
> write-release.
> 
> Signed-off-by: Alan Stern 

Reviewed-by: Andrea Parri 

  Andrea


> 
> ---
> 
> 
> [as1870]
> 
> 
>  tools/memory-model/Documentation/explanation.txt |   35 
> ---
>  tools/memory-model/linux-kernel.cat  |6 +--
>  2 files changed, 22 insertions(+), 19 deletions(-)
> 
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -38,7 +38,7 @@ let strong-fence = mb | gp
>  (* Release Acquire *)
>  let acq-po = [Acquire] ; po ; [M]
>  let po-rel = [M] ; po ; [Release]
> -let rfi-rel-acq = [Release] ; rfi ; [Acquire]
> +let rel-rf-acq-po = [Release] ; rf ; [Acquire] ; po
>  
>  (**)
>  (* Fundamental coherence ordering *)
> @@ -60,9 +60,9 @@ let dep = addr | data
>  let rwdep = (dep | ctrl) ; [W]
>  let overwrite = co | fr
>  let to-w = rwdep | (overwrite & int)
> -let to-r = addr | (dep ; rfi) | rfi-rel-acq
> +let to-r = addr | (dep ; rfi)
>  let fence = strong-fence | wmb | po-rel | rmb | acq-po
> -let ppo = to-r | to-w | fence
> +let ppo = to-r | to-w | fence | (rel-rf-acq-po & int)
>  
>  (* Propagation: Ordering from release operations and strong fences. *)
>  let A-cumul(r) = rfe? ; r
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -1067,27 +1067,30 @@ allowing out-of-order writes like this t
>  violating the write-write coherence rule by requiring the CPU not to
>  send the W write to the memory subsystem at all!)
>  
> -There is one last example of preserved program order in the LKMM: when
> -a load-acquire reads from an earlier store-release.  For example:
> +There is one last example of preserved program order in the LKMM; it
> +applies to instructions po-after a load-acquire which reads from an
> +earlier store-release.  For example:
>  
>   smp_store_release(&x, 123);
>   r1 = smp_load_acquire(&x);
> + WRITE_ONCE(&y, 246);
>  
>  If the smp_load_acquire() ends up obtaining the 123 value that was
> -stored by the smp_store_release(), the LKMM says that the load must be
> -executed after the store; the store cannot be forwarded to the load.
> -This requirement does not arise from the operational model, but it
> -yields correct predictions on all architectures supported by the Linux
> -kernel, although for differing reasons.
> -
> -On some architectures, including x86 and ARMv8, it is true that the
> -store cannot be forwarded to the load.  On others, including PowerPC
> -and ARMv7, smp_store_release() generates object code that starts with
> -a fence and smp_load_acquire() generates object code that ends with a
> -fence.  The upshot is that even though the store may be forwarded to
> -the load, it is still true that any instruction preceding the store
> -will be executed before the load or any following instructions, and
> -the store will be executed before any instruction following the load.
> +written by the smp_store_release(), the LKMM says that the store to y
> +must be executed after the store to x.  In fact, the only way this
> +could fail would be if the store-release was forwarded to the
> +load-acquire; the LKMM says it holds even in that case.  This
> +requirement does not arise from the operational model, but it yields
> +correct predictions on all architectures supported by the Linux
> +kerne

Re: [PATCH v4 1/3] compiler-gcc.h: add gnu_inline to all inline declarations

2018-06-08 Thread Andrea Parri
On Fri, Jun 08, 2018 at 12:04:36PM +0200, Sedat Dilek wrote:
> On Fri, Jun 8, 2018 at 9:59 AM, Arnd Bergmann  wrote:
> > On Thu, Jun 7, 2018 at 10:49 PM, Nick Desaulniers
> >  wrote:
> >> Functions marked extern inline do not emit an externally visible
> >> function when the gnu89 C standard is used. Some KBUILD Makefiles
> >> overwrite KBUILD_CFLAGS. This is an issue for GCC 5.1+ users as without
> >> an explicit C standard specified, the default is gnu11. Since c99, the
> >> semantics of extern inline have changed such that an externally visible
> >> function is always emitted. This can lead to multiple definition errors
> >> of extern inline functions at link time of compilation units whose build
> >> files have removed an explicit C standard compiler flag for users of GCC
> >> 5.1+ or Clang.
> >>
> >> Signed-off-by: Nick Desaulniers 
> >> Suggested-by: H. Peter Anvin 
> >> Suggested-by: Joe Perches 
> >
> > I suspect this will break Geert's gcc-4.1.2, which I think doesn't have that
> > attribute yet (4.1.3 or higher have it according to the documentation.
> >
> > It wouldn't be hard to work around that if we want to keep that version
> > working, or we could decide that it's time to officially stop supporting
> > that version, but we should probably decide on one or the other.
> >
> 
> Good point.
> What is the minimum requirement of GCC version currently?

Good question ;-)  (I recently had the impression that
Documentation/process/changes.rst was making fun of me ;-))


> AFAICS x86/asm-goto support requires GCC >= 4.5?
> 
> Just FYI...
> ...saw the last days in upstream commits that kbuild/kconfig for
> 4.18-rc1 offers possibilities to check for cc-version dependencies.

Good to know!  Mind retrieving/sharing the commit id(s)
or links to the corresponding discussion on LKML?

Thanks,
  Andrea


> 
> - sed@ -


Re: [PATCH V5] powercap/drivers/idle_injection: Add an idle injection framework

2018-06-06 Thread Andrea Parri
Hi Daniel, Viresh,

On Wed, Jun 06, 2018 at 04:15:28PM +0530, Viresh Kumar wrote:
> On 06-06-18, 12:22, Daniel Lezcano wrote:
> > (mb() are done in the atomic operations AFAICT).

To do my bit, not all atomic ops do/imply memory barriers; e.g.,

  [from Documentation/atomic_t.txt]

  - non-RMW operations [e.g., atomic_set()] are unordered

  - RMW operations that have no return value [e.g., atomic_inc()] are unordered
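
  For instance, a sketch based on these rules (with atomic_inc_return()
  as the fully-ordered counterpart):

	atomic_set(&v, 1);		/* non-RMW: no ordering implied */
	atomic_inc(&v);			/* RMW, no return value: no ordering implied */
	r0 = atomic_inc_return(&v);	/* RMW with return value: fully ordered */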


> 
> AFAIU, it is required to make sure the operations are seen in a particular 
> order
> on another CPU and the compiler doesn't reorganize code to optimize it.
> 
> For example, in our case what if the compiler reorganizes the atomic-set
> operation after wakeup-process ? But maybe that wouldn't happen across 
> function
> calls and we should be safe then.

IIUC, wake_up_process() implies a full memory barrier and a compiler barrier,
due to:

  raw_spin_lock_irqsave(&p->pi_lock, flags);
  smp_mb__after_spinlock();

The pattern under discussion isn't clear to me, but if you'll end up relying
on this "implicit" barrier I'd suggest documenting it with a comment.

  Andrea


Re: [PATCH RFC tools/memory-model] Add litmus-test naming scheme

2018-05-29 Thread Andrea Parri
[...]

> > Right, thanks.  Ah, maybe we should strive to meet the 80-chars bound
> > by splitting the command with "\"?
> 
> We could, but combined with your later request for indentation, we end
> up with something like this:
> 
>   $ norm7 -bell linux-kernel.bell \
>   Rfi Once PodRR Once Fre Once Rfi Once PodRR Once Fre Once | \
> sed -e 's/:.*//g'
>   SB+rfionceonce-poonceonces
> 
> In the immortal words of MSDOS, are you sure?  ;-)

I find it more readable, but it's just taste ;-)  Commands are indented
with 2 spaces in the other README.


> > Well, "Rfi" produces "rfi" while "PosWR" produces "pos" for a name...
> 
> Right you are!  How about this, then?
> 
> Rfi: Read-from internal.  The current process wrote a variable and then
> immediately read the value back from it.  For the purposes of
> litmus-test code generation, Rfi acts identically to PosWR.
> However, they differ for purposes of naming, and they also result
> in different "exists" clauses.
>   Example:  ???

LGTM, thanks.
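
(For the "Example:  ???", maybe the Rfi pair from SB+rfionceonce-poonceonces:

	WRITE_ONCE(*x, 1);
	r1 = READ_ONCE(*x);
)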

  Andrea


[PATCH] tools/memory-model: Rename litmus tests to comply to norm7

2018-05-29 Thread Andrea Parri
norm7 produces the 'normalized' name of a litmus test, when the test
can be generated from a single cycle that passes through each process
exactly once. The commit renames such tests in order to comply with the
naming scheme implemented by this tool.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
---
 tools/memory-model/Documentation/recipes.txt   |  8 ++--
 tools/memory-model/README  | 20 +-
 .../IRIW+fencembonceonces+OnceOnce.litmus  | 45 ++
 .../litmus-tests/IRIW+mbonceonces+OnceOnce.litmus  | 45 --
 .../litmus-tests/LB+ctrlonceonce+mbonceonce.litmus | 34 
 .../LB+fencembonceonce+ctrlonceonce.litmus | 34 
 .../MP+fencewmbonceonce+fencermbonceonce.litmus| 30 +++
 .../litmus-tests/MP+wmbonceonce+rmbonceonce.litmus | 30 ---
 .../litmus-tests/R+fencembonceonces.litmus | 30 +++
 .../memory-model/litmus-tests/R+mbonceonces.litmus | 30 ---
 tools/memory-model/litmus-tests/README | 16 
 .../S+fencewmbonceonce+poacquireonce.litmus| 27 +
 .../S+wmbonceonce+poacquireonce.litmus | 27 -
 .../litmus-tests/SB+fencembonceonces.litmus| 32 +++
 .../litmus-tests/SB+mbonceonces.litmus | 32 ---
 .../WRC+pooncerelease+fencermbonceonce+Once.litmus | 38 ++
 .../WRC+pooncerelease+rmbonceonce+Once.litmus  | 38 --
 ...release+poacquirerelease+fencembonceonce.litmus | 42 
 ...ooncerelease+poacquirerelease+mbonceonce.litmus | 42 
 19 files changed, 300 insertions(+), 300 deletions(-)
 create mode 100644 
tools/memory-model/litmus-tests/IRIW+fencembonceonces+OnceOnce.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/LB+ctrlonceonce+mbonceonce.litmus
 create mode 100644 
tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus
 create mode 100644 
tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/MP+wmbonceonce+rmbonceonce.litmus
 create mode 100644 tools/memory-model/litmus-tests/R+fencembonceonces.litmus
 delete mode 100644 tools/memory-model/litmus-tests/R+mbonceonces.litmus
 create mode 100644 
tools/memory-model/litmus-tests/S+fencewmbonceonce+poacquireonce.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/S+wmbonceonce+poacquireonce.litmus
 create mode 100644 tools/memory-model/litmus-tests/SB+fencembonceonces.litmus
 delete mode 100644 tools/memory-model/litmus-tests/SB+mbonceonces.litmus
 create mode 100644 
tools/memory-model/litmus-tests/WRC+pooncerelease+fencermbonceonce+Once.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus
 create mode 100644 
tools/memory-model/litmus-tests/Z6.0+pooncerelease+poacquirerelease+fencembonceonce.litmus
 delete mode 100644 
tools/memory-model/litmus-tests/Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus

diff --git a/tools/memory-model/Documentation/recipes.txt 
b/tools/memory-model/Documentation/recipes.txt
index ee4309a87fc45..a40802fa1099e 100644
--- a/tools/memory-model/Documentation/recipes.txt
+++ b/tools/memory-model/Documentation/recipes.txt
@@ -126,7 +126,7 @@ However, it is not necessarily the case that accesses 
ordered by
 locking will be seen as ordered by CPUs not holding that lock.
 Consider this example:
 
-   /* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
+   /* See Z6.0+pooncerelease+poacquirerelease+fencembonceonce.litmus. */
void CPU0(void)
{
spin_lock(&mylock);
@@ -292,7 +292,7 @@ and to use smp_load_acquire() instead of smp_rmb().  
However, the older
 smp_wmb() and smp_rmb() APIs are still heavily used, so it is important
 to understand their use cases.  The general approach is shown below:
 
-   /* See MP+wmbonceonce+rmbonceonce.litmus. */
+   /* See MP+fencewmbonceonce+fencermbonceonce.litmus. */
void CPU0(void)
{
WRITE_ONCE(x, 1);
@@ -360,7 +360,7 @@ can be seen in the LB+poonceonces.litmus litmus test.
 One way of avoiding the counter-intuitive outcome is through the use of a
 control dependency paired with a full memory barrier:
 
-   /* See LB+ctrlonceonce+mbonceonce.litmus. */
+   /* See LB+fencembonceonce+ctrlonceonce.litmus. */
void CPU0(void)
{
r0 = READ_ONCE(x);
@@ -476,7 +476,7 @@ that one CPU first stores to one variable and then loads 
from a second,
 while another CPU stores to the second variable and then loads from the
 

Re: [PATCH RFC tools/memory-model] Add litmus-test naming scheme

2018-05-29 Thread Andrea Parri
FenceMbdRW FenceMbdW* 
> FenceMbdWR FenceMbdWW FenceMbs** FenceMbs*R FenceMbs*W FenceMbsR* FenceMbsRR 
> FenceMbsRW FenceMbsW* FenceMbsWR FenceMbsWW FenceRcu-lock FenceRcu-lockd** 
> FenceRcu-lockd*R FenceRcu-lockd*W FenceRcu-lockdR* FenceRcu-lockdRR 
> FenceRcu-lockdRW FenceRcu-lockdW* FenceRcu-lockdWR FenceRcu-lockdWW 
> FenceRcu-locks** FenceRcu-locks*R FenceRcu-locks*W FenceRcu-locksR* 
> FenceRcu-locksRR FenceRcu-locksRW FenceRcu-locksW* FenceRcu-locksWR 
> FenceRcu-locksWW FenceRcu-unlock FenceRcu-unlockd** FenceRcu-unlockd*R 
> FenceRcu-unlockd*W FenceRcu-unlockdR* FenceRcu-unlockdRR FenceRcu-unlockdRW 
> FenceRcu-unlockdW* FenceRcu-unlockdWR FenceRcu-unlockdWW FenceRcu-unlocks** 
> FenceRcu-unlocks*R FenceRcu-unlocks*W FenceRcu-unlocksR* FenceRcu-unlocksRR 
> FenceRcu-unlocksRW FenceRcu-unlocksW* FenceRcu-unlocksWR FenceRcu-unlocksWW 
> FenceRmb FenceRmbd** FenceRmbd*R FenceRmbd*W FenceRmbdR* FenceRmbdRR 
> FenceRmbdRW FenceRmbdW* FenceRmbdWR FenceRmbdWW FenceRmbs** FenceRmbs*R 
> FenceRmbs*W FenceRmbsR* FenceRmbsRR FenceRmbsRW FenceRmbsW* FenceRmbsWR 
> FenceRmbsWW FenceSync-rcu FenceSync-rcud** FenceSync-rcud*R FenceSync-rcud*W 
> FenceSync-rcudR* FenceSync-rcudRR FenceSync-rcudRW FenceSync-rcudW* 
> FenceSync-rcudWR FenceSync-rcudWW FenceSync-rcus** FenceSync-rcus*R 
> FenceSync-rcus*W FenceSync-rcusR* FenceSync-rcusRR FenceSync-rcusRW 
> FenceSync-rcusW* FenceSync-rcusWR FenceSync-rcusWW FenceWmb FenceWmbd** 
> FenceWmbd*R FenceWmbd*W FenceWmbdR* FenceWmbdRR FenceWmbdRW FenceWmbdW* 
> FenceWmbdWR FenceWmbdWW FenceWmbs** FenceWmbs*R FenceWmbs*W FenceWmbsR* 
> FenceWmbsRR FenceWmbsRW FenceWmbsW* FenceWmbsWR FenceWmbsWW Fenced** Fenced*R 
> Fenced*W FencedR* FencedRR FencedRW FencedW* FencedWR FencedWW Fences** 
> Fences*R Fences*W FencesR* FencesRR FencesRW FencesW* FencesWR FencesWW 
> FrBack FrLeave Fre Fri Hat Na Pod** Pod*R Pod*W PodR* PodRR PodRW PodW* PodWR 
> PodWW Pos** Pos*R Pos*W PosR* PosRR PosRW PosW* PosWR PosWW R Read RfBack 
> RfLeave Rfe Rfi Rmw W Write WsBack WsLeave Wse Wsi
> 
> I added the following at the end:
> 
> Please note that the above is a partial list.  To see the full list of
> descriptors, execute the following command:
> 
> $ diyone7 -bell linux-kernel.bell -show edges

Thanks.  One more nit: I'd indent this and the above "norm7" commands as
we do in our "main" README.


> 
> > I also notice that our current names for tests with fences (and cycle)
> > deviate from the corresponding 'norm7' results; e.g.,
> > 
> >   $ norm7 -bell linux-kernel.bell FenceWmbdWW Once Rfe Once FenceRmbdRR 
> > Once Fre Once | sed -e 's/:.*//g'
> >   MP+fencewmbonceonce+fencermbonceonce
> > 
> > while we use 'MP+wmbonceonce+rmbonceonce' (that is, we omit the 'fence'
> > prefixes).
> 
> Would you be willing to send me a patch fixing them up?

Yes, I'll work this out.

  Andrea


> 
> Please see below for updated patch.
> 
>   Thanx, Paul
> 
> 
> 
> commit 04a897a8e202e8d79dd47910321f0e8efb081854
> Author: Paul E. McKenney 
> Date:   Fri May 25 12:02:53 2018 -0700
> 
> EXP tools/memory-model: Add litmus-test naming scheme
> 
> This commit documents the scheme used to generate the names for the
> litmus tests.
> 
> Signed-off-by: Paul E. McKenney 
> [ paulmck: Apply feedback from Andrea Parri. ]
> 
> diff --git a/tools/memory-model/litmus-tests/README 
> b/tools/memory-model/litmus-tests/README
> index 00140aaf58b7..9c0ea65c5362 100644
> --- a/tools/memory-model/litmus-tests/README
> +++ b/tools/memory-model/litmus-tests/README
> @@ -1,4 +1,6 @@
> -This directory contains the following litmus tests:
> +
> +LITMUS TESTS
> +
>  
>  CoRR+poonceonce+Once.litmus
>   Test of read-read coherence, that is, whether or not two
> @@ -151,3 +153,143 @@ Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus
>  A great many more litmus tests are available here:
>  
>   https://github.com/paulmckrcu/litmus
> +
> +==
> +LITMUS TEST NAMING
> +==
> +
> +Litmus tests are usually named based on their contents, which means that
> +looking at the name tells you what the litmus test does.  The naming
> +scheme covers litmus tests having a single cycle that passes through
> +each process exactly once, so litmus tests not fitting this description
> +are named on an ad-hoc basis.
> +
> +The structure of a litmus-test name is the litmus-test class, a plus
> +sign ("+"), and one string for each process, separated b

Re: [PATCH RFC tools/memory-model] Add litmus-test naming scheme

2018-05-28 Thread Andrea Parri
On Fri, May 25, 2018 at 12:10:20PM -0700, Paul E. McKenney wrote:
> This commit documents the scheme used to generate the names for the
> litmus tests.
> 
> Signed-off-by: Paul E. McKenney 
> ---
>  README |  136 
> -
>  1 file changed, 135 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/memory-model/litmus-tests/README 
> b/tools/memory-model/litmus-tests/README
> index 00140aaf58b7..b81f51054cd3 100644
> --- a/tools/memory-model/litmus-tests/README
> +++ b/tools/memory-model/litmus-tests/README
> @@ -1,4 +1,6 @@
> -This directory contains the following litmus tests:
> +
> +LITMUS TESTS
> +
>  
>  CoRR+poonceonce+Once.litmus
>   Test of read-read coherence, that is, whether or not two
> @@ -151,3 +153,135 @@ Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus
>  A great many more litmus tests are available here:
>  
>   https://github.com/paulmckrcu/litmus
> +
> +==
> +LITMUS TEST NAMING
> +==
> +
> +Litmus tests are usually named based on their contents, which means that
> +looking at the name tells you what the litmus test does.  The naming
> +scheme covers litmus tests having a single cycle that passes through
> +each process exactly once, so litmus tests not fitting this description
> +are named on an ad-hoc basis.
> +
> +The structure of a litmus-test name is the litmus-test class, a plus
> +sign ("+"), and one string for each process, separated by plus signs.
> +The end of the name is ".litmus".

We used to distinguish between the test name and the test filename; we
currently have only one test whose name ends with .litmus:

  ISA2+pooncelock+pooncelock+pombonce.litmus

(that I missed until now...).


> +
> +The litmus-test classes may be found in the infamous test6.pdf:
> +https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf
> +Each class defines the pattern of accesses and of the variables accessed.
> +For example, if the one process writes to a pair of variables, and
> +the other process reads from these same variables, the corresponding
> +litmus-test class is "MP" (message passing), which may be found on the
> +left-hand end of the second row of tests on page one of test6.pdf.
> +
> +The strings used to identify the actions carried out by each process are
> +complex due to a desire to have finite-length names.

I'm not sure what you mean here: can you elaborate/rephrase?


> Thus, there is a
> +tool to generate these strings from a given litmus test's actions.  For
> +example, consider the processes from SB+rfionceonce-poonceonces.litmus:
> +
> + P0(int *x, int *y)
> + {
> + int r1;
> + int r2;
> +
> + WRITE_ONCE(*x, 1);
> + r1 = READ_ONCE(*x);
> + r2 = READ_ONCE(*y);
> + }
> +
> + P1(int *x, int *y)
> + {
> + int r3;
> + int r4;
> +
> + WRITE_ONCE(*y, 1);
> + r3 = READ_ONCE(*y);
> + r4 = READ_ONCE(*x);
> + }
> +
> +The next step is to construct a space-separated list of descriptors,
> +interleaving descriptions of the relation between a pair of consecutive
> +accesses with descriptions of the second access in the pair.
> +
> +P0()'s WRITE_ONCE() is read by its first READ_ONCE(), which is a
> +reads-from link (rf) and internal to the P0() process.  This is
> +"rfi", which is an abbreviation for "reads-from internal".  Because
> +some of the tools string these abbreviations together with space
> +characters separating processes, the first character is capitalized,
> +resulting in "Rfi".
> +
> +P0()'s second access is a READ_ONCE(), as opposed to (for example)
> +smp_load_acquire(), so next is "Once".  Thus far, we have "Rfi Once".
> +
> +P0()'s third access is also a READ_ONCE(), but to y rather than x.
> +This is related to P0()'s second access by program order ("po"),
> +to a different variable ("d"), and both accesses are reads ("RR").
> +The resulting descriptor is "PodRR".  Because P0()'s third access is
> +READ_ONCE(), we add another "Once" descriptor.
> +
> +A from-read ("fre") relation links P0()'s third to P1()'s first
> +access, and the resulting descriptor is "Fre".  P1()'s first access is
> +WRITE_ONCE(), which as before gives the descriptor "Once".  The string
> +thus far is thus "Rfi Once PodRR Once Fre Once".
> +
> +The remainder of P1() is similar to P0(), which means we add
> +"Rfi Once PodRR Once".  Another fre links P1()'s last access to
> +P0()'s first access, which is WRITE_ONCE(), so we add "Fre Once".
> +The full string is thus:
> +
> + Rfi Once PodRR Once Fre Once Rfi Once PodRR Once Fre Once
> +
> +This string can be given to the "norm7" and "classify7" tools to
> +produce the name:
> +
> +$ norm7 -bell linux-kernel.bell Rfi Once PodRR Once Fre Once Rfi Once PodRR 
> Once Fre Once |  classify7 -bell linux-kernel.bell -diyone | sed -e 's/:.*//g'
> +SB+rfionceonce-poonceonces

We should ch

Re: [PATCH 0/2] mm->owner to mm->memcg fixes

2018-05-24 Thread Andrea Parri
On Thu, May 24, 2018 at 02:16:35PM -0700, Andrew Morton wrote:
> On Thu, 24 May 2018 13:10:02 +0200 Michal Hocko  wrote:
> 
> > I would really prefer and appreciate a repost with all the fixes folded
> > in.
> 
> [1/2]
> 
> From: "Eric W. Biederman" 
> Subject: memcg: replace mm->owner with mm->memcg
> 
> Recently it was reported that mm_update_next_owner could get into cases
> where it was executing its fallback for_each_process part of the loop and
> thus taking up a lot of time.

Reference?


> 
> To deal with this replace mm->owner with mm->memcg.  This just reduces the
> complexity of everything.

"the complexity of everything"?


> As much as possible I have maintained the
> current semantics.

"As much as possible"?


> There are two siginificant exceptions.

s/siginificant/significant


> During fork
> the memcg of the process calling fork is charged rather than init_css_set.
> During memory cgroup migration the charges are migrated not if the
> process is the owner of the mm, but if the process being migrated has the
> same memory cgroup as the mm.
> 
> I believe it was a bug

It was a bug or not??


> if init_css_set is charged for memory activity
> during fork, and the old behavior was simply a consequence of the new task
> not having tsk->cgroup not initialized to it's proper cgroup.
> 
> During cgroup migration only thread group leaders are allowed to migrate. 
> Which means in practice there should only be one.

"in practice there should"??


> Linux tasks created
> with CLONE_VM are the only exception, but the common cases are already
> ruled out.  Processes created with vfork have a suspended parent and can
> do nothing but call exec so they should never show up.  Threads of the
> same cgroup are not the thread group leader so also should not show up. 
> That leaves the old LinuxThreads library which is probably out of use by

"probably"???


> now, and someone doing something very creative with cgroups,

"very creative"?


> and rolling
> their own threads with CLONE_VM.  So in practice I don't think

"in practice I don't think"??

  Andrea


> the
> difference charge migration will affect anyone.
> 
> To ensure that mm->memcg is updated appropriately I have implemented
> cgroup "attach" and "fork" methods.  This ensures that at those points the
> mm pointed to the task has the appropriate memory cgroup.
> 
> For simplicity instead of introducing a new mm lock I simply use exchange
> on the pointer where the mm->memcg is updated to get atomic updates.
> 
> Looking at the history effectively this change is a revert.  The reason
> given for adding mm->owner is so that multiple cgroups can be attached to
> the same mm.  In the last 8 years a second user of mm->owner has not
> appeared.  A feature that has never used, makes the code more complicated
> and has horrible worst case performance should go.
> 
> [ebied...@xmission.com: update to work when !CONFIG_MMU]
>   Link: http://lkml.kernel.org/r/87lgczcox0.fsf...@xmission.com
> [ebied...@xmission.com: close race between migration and installing bprm->mm 
> as mm]
>   Link: http://lkml.kernel.org/r/87fu37cow4.fsf...@xmission.com
> Link: http://lkml.kernel.org/r/87lgd1zww0.fsf...@xmission.com
> Fixes: cf475ad28ac3 ("cgroups: add an owner to the mm_struct")
> Signed-off-by: "Eric W. Biederman" 
> Reported-by: Kirill Tkhai 
> Acked-by: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: "Kirill A. Shutemov" 
> Cc: Tejun Heo 
> Cc: Oleg Nesterov 
> Signed-off-by: Andrew Morton 
> ---
> 
>  fs/exec.c  |3 -
>  include/linux/memcontrol.h |   16 +-
>  include/linux/mm_types.h   |   12 
>  include/linux/sched/mm.h   |8 ---
>  kernel/exit.c  |   89 ---
>  kernel/fork.c  |   17 +-
>  mm/debug.c |4 -
>  mm/memcontrol.c|   81 +++
>  8 files changed, 93 insertions(+), 137 deletions(-)
> 
> diff -puN fs/exec.c~memcg-replace-mm-owner-with-mm-memcg fs/exec.c
> --- a/fs/exec.c~memcg-replace-mm-owner-with-mm-memcg
> +++ a/fs/exec.c
> @@ -1040,11 +1040,12 @@ static int exec_mmap(struct mm_struct *m
>   up_read(&old_mm->mmap_sem);
>   BUG_ON(active_mm != old_mm);
>   setmax_mm_hiwater_rss(&tsk->signal->maxrss, old_mm);
> - mm_update_next_owner(old_mm);
>   mmput(old_mm);
>   return 0;
>   }
>   mmdrop(active_mm);
> + /* The tsk may have migrated before the new mm was attached */
> + mm_sync_memcg_from_task(tsk);
>   return 0;
>  }
>  
> diff -puN include/linux/memcontrol.h~memcg-replace-mm-owner-with-mm-memcg 
> include/linux/memcontrol.h
> --- a/include/linux/memcontrol.h~memcg-replace-mm-owner-with-mm-memcg
> +++ a/include/linux/memcontrol.h
> @@ -345,7 +345,6 @@ out:
>  struct lruvec *mem_cgroup_page_lruvec(struct page *, struct pglist_data *);
>  
>  bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg);
> -

Re: [PATCH 6/9] asm-generic/bitops/atomic.h: Rewrite using atomic_fetch_*

2018-05-24 Thread Andrea Parri
Hi Mark,

> As an aside, If I complete the autogeneration stuff, it'll be possible
> to generate those. I split out the necessary barriers in [1], but I
> still have a lot of other preparatory cleanup to do.

I do grasp the rationale behind that naming:

  __atomic_mb_{before,after}_{acquire,release,fence}()

and yet I remain puzzled by it:

For example, can you imagine (using):

  __atomic_mb_before_acquire() ?

(as your __atomic_mb_after_acquire() is whispering me "acquire-fences"...)

Another example:

  the "atomic" in that "smp_mb__{before,after}_atomic" is so "suggestive"!
   
(think at x86...), but it's not explicit in the proposed names.
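
(The typical usage being, a sketch:

	atomic_inc(&v);			/* unordered RMW */
	smp_mb__after_atomic();		/* pairs the RMW with a full barrier;
					   reduces to a compiler barrier on
					   x86, where RMW atomics are already
					   fully ordered */
)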

I don't have other names to suggest at the moment...  ;/ (aka just saying)

  Andrea


> 
> IIUC, the void-returning atomic ops are relaxed, so trying to unify that
> with the usual rule that no suffix means fence will slow things down
> unless we want to do a treewide substitition to fixup for that.
> 
> Thanks,
> Mark.
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/api-unification&id=c6b9ff2627d06776e427a7f1a7f83caeff3db536


linux-kernel@vger.kernel.org

2018-05-24 Thread Andrea Parri
> Yeah, lemme put some details here:
> 
> So we have three cases:
> 
> Case #1 (from Will)
> 
>   P0:			P1:			P2:
> 
>   spin_lock(&slock)	read_lock(&rwlock)
>						write_lock(&rwlock)
>   read_lock(&rwlock)	spin_lock(&slock)
> 
> , which is a deadlock, and couldn't not be detected by lockdep yet. And
> lockdep could detect this with the patch I attach at the end of the
> mail.
> 
> Case #2
> 
>   P0:			P1:			P2:
> 
>   <in irq>
>   spin_lock(&slock)	read_lock(&rwlock)
>						write_lock(&rwlock)
>   read_lock(&rwlock)	spin_lock_irq(&slock)
> 
> , which is not a deadlock, as the read_lock() on P0 can use the unfair
> fastpass.
> 
> Case #3
> 
>   P0:			P1:			P2:
> 
>			<in irq>
>   spin_lock_irq(&slock)	read_lock(&rwlock)
>						write_lock_irq(&rwlock)
>   read_lock(&rwlock)	spin_lock(&slock)
> 
> , which is a deadlock, as the read_lock() on P0 cannot use the fastpass.

Mmh, I'm starting to think that, maybe, we need a model (a tool) to
distinguish these and other(?) cases (sorry, I could not resist ;-)

[...]


> --->8
> Subject: [PATCH] locking: More accurate annotations for read_lock()
> 
> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> read lock, actually it's only recursive if in_interrupt() is true. So

Mmh, taking the "common denominator" over archs/Kconfig options and
CPU states, this would suggest that read_lock() is non-recursive;

it looks like I can say "good-bye" to my idea to define (formalize)
consistent executions/the memory ordering of RW-LOCKS in terms of the
following _emulation_:

void read_lock(rwlock_t *s)
{
	/* Readers increment the counter (acquire). */
	r0 = atomic_fetch_inc_acquire(&s->val);
}

void read_unlock(rwlock_t *s)
{
	/* Readers decrement the counter (release). */
	r0 = atomic_fetch_sub_release(&s->val);
}

void write_lock(rwlock_t *s)
{
	/* Writers take the lock by moving the counter from 0 to -1 (acquire). */
	r0 = atomic_cmpxchg_acquire(&s->val, 0, -1);
}

void write_unlock(rwlock_t *s)
{
	/* Writers restore the counter to 0 (release). */
	atomic_set_release(&s->val, 0);
}

/* Keep only the executions in which every lock operation succeeded. */
filter (~read_lock:r0=-1 /\ write_lock:r0=0)

[...]


> The code is done, I'm just working on the rework for documention stuff,
> so if anyone is interested, could try it out ;-)

Any idea on how to "educate" the LKMM about this code/documentation?

  Andrea


Re: [PATCH v9 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-15 Thread Andrea Parri
Hi Bart,

On Mon, May 14, 2018 at 11:46:33AM -0700, Bart Van Assche wrote:

[...]

> diff --git a/Documentation/features/locking/cmpxchg64/arch-support.txt 
> b/Documentation/features/locking/cmpxchg64/arch-support.txt
> new file mode 100644
> index ..65b3290ce5d5
> --- /dev/null
> +++ b/Documentation/features/locking/cmpxchg64/arch-support.txt
> @@ -0,0 +1,31 @@
> +#
> +# Feature name:  cmpxchg64
> +# Kconfig:   ARCH_HAVE_CMPXCHG64
> +# description:   arch supports the cmpxchg64() API
> +#
> +---
> +| arch |status|
> +---
> +|   alpha: |  ok  |
> +| arc: | TODO |
> +| arm: |!thumb|
> +|   arm64: |  ok  |
> +| c6x: | TODO |
> +|   h8300: | TODO |
> +| hexagon: | TODO |
> +|ia64: |  ok  |
> +|m68k: |  ok  |
> +|  microblaze: | TODO |
> +|mips: |64-bit|
> +|   nios2: | TODO |
> +|openrisc: | TODO |
> +|  parisc: |  ok  |
> +| powerpc: |64-bit|
> +|s390: |  ok  |
> +|  sh: | TODO |
> +|   sparc: |  ok  |
> +|  um: | TODO |
> +|   unicore32: | TODO |
> +| x86: |  ok  |
> +|  xtensa: |  ok  |
> +---

nds32 and riscv seem to be missing from the table. I'd also suggest
sticking to the three entries documented in

  Documentation/features/arch-support.txt
  
and using the header comment to provide any additional information.

A script that refreshes the arch support status file in place (from
the Kconfig files) is currently available in linux-next: c.f.,

  Documentation/features/scripts/features-refresh.sh
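
(e.g., run from the root of the kernel tree; assuming the default,
argument-less invocation refreshes all the arch-support.txt files in place:

  $ sh Documentation/features/scripts/features-refresh.sh
)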

  Andrea


[tip:locking/core] tools/memory-model: Update ASPLOS information

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  1a00b4554d477f05199e22ee71ba4c2525ca44cb
Gitweb: https://git.kernel.org/tip/1a00b4554d477f05199e22ee71ba4c2525ca44cb
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:56 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:18 +0200

tools/memory-model: Update ASPLOS information

ASPLOS 2018 was held in March: make sure this is reflected in
header comments and references.

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Cc: Akira Yokosawa 
Cc: Alan Stern 
Cc: Andrew Morton 
Cc: Boqun Feng 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Linus Torvalds 
Cc: Luc Maranget 
Cc: Nicholas Piggin 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-a...@vger.kernel.org
Cc: parri.and...@gmail.com
Link: 
http://lkml.kernel.org/r/1526340837-1-18-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/Documentation/references.txt | 11 ++-
 tools/memory-model/linux-kernel.bell|  4 ++--
 tools/memory-model/linux-kernel.cat |  4 ++--
 tools/memory-model/linux-kernel.def |  4 ++--
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/tools/memory-model/Documentation/references.txt 
b/tools/memory-model/Documentation/references.txt
index ba2e34c2ec3f..74f448f2616a 100644
--- a/tools/memory-model/Documentation/references.txt
+++ b/tools/memory-model/Documentation/references.txt
@@ -67,11 +67,12 @@ o   Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan 
Nienhuis,
 Linux-kernel memory model
 =
 
-o  Andrea Parri, Alan Stern, Luc Maranget, Paul E. McKenney,
-   and Jade Alglave.  2017. "A formal model of
-   Linux-kernel memory ordering - companion webpage".
-   http://moscova.inria.fr/∼maranget/cats7/linux/. (2017). [Online;
-   accessed 30-January-2017].
+o  Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
+   Alan Stern.  2018. "Frightening small children and disconcerting
+   grown-ups: Concurrency in the Linux kernel". In Proceedings of
+   the 23rd International Conference on Architectural Support for
+   Programming Languages and Operating Systems (ASPLOS 2018). ACM,
+   New York, NY, USA, 405-418.  Webpage: http://diy.inria.fr/linux/.
 
 o  Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
Alan Stern.  2017.  "A formal kernel memory-ordering model (part 1)"
diff --git a/tools/memory-model/linux-kernel.bell 
b/tools/memory-model/linux-kernel.bell
index 432c7cf71b23..64f5740e0e75 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -5,10 +5,10 @@
  * Copyright (C) 2017 Alan Stern ,
  *    Andrea Parri 
  *
- * An earlier version of this file appears in the companion webpage for
+ * An earlier version of this file appeared in the companion webpage for
  * "Frightening small children and disconcerting grown-ups: Concurrency
  * in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
- * which is to appear in ASPLOS 2018.
+ * which appeared in ASPLOS 2018.
  *)
 
 "Linux-kernel memory consistency model"
diff --git a/tools/memory-model/linux-kernel.cat 
b/tools/memory-model/linux-kernel.cat
index 1e5c4653dd12..59b5cbe6b624 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -5,10 +5,10 @@
  * Copyright (C) 2017 Alan Stern ,
  *Andrea Parri 
  *
- * An earlier version of this file appears in the companion webpage for
+ * An earlier version of this file appeared in the companion webpage for
  * "Frightening small children and disconcerting grown-ups: Concurrency
  * in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
- * which is to appear in ASPLOS 2018.
+ * which appeared in ASPLOS 2018.
  *)
 
 "Linux-kernel memory consistency model"
diff --git a/tools/memory-model/linux-kernel.def 
b/tools/memory-model/linux-kernel.def
index f0553bd37c08..6fa3eb28d40b 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -1,9 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0+
 //
-// An earlier version of this file appears in the companion webpage for
+// An earlier version of this file appeared in the companion webpage for
 // "Frightening small children and disconcerting grown-ups: Concurrency
 // in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
-// which is to appear in ASPLOS 2018.
+// which appeared in ASPLOS 2018.
 
 // ONCE
 READ_ONCE(X) __load{once}(X)


[tip:locking/core] tools/memory-model: Add reference for 'Simplifying ARM concurrency'

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  99c12749b172758f6973fc023484f2fc8b91cd5a
Gitweb: https://git.kernel.org/tip/99c12749b172758f6973fc023484f2fc8b91cd5a
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:57 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:19 +0200

tools/memory-model: Add reference for 'Simplifying ARM concurrency'

The paper discusses the revised ARMv8 memory model; this revision
had an important impact on the design of the LKMM.

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Cc: Akira Yokosawa 
Cc: Alan Stern 
Cc: Andrew Morton 
Cc: Boqun Feng 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Linus Torvalds 
Cc: Luc Maranget 
Cc: Nicholas Piggin 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-a...@vger.kernel.org
Cc: parri.and...@gmail.com
Link: 
http://lkml.kernel.org/r/1526340837-1-19-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/Documentation/references.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/memory-model/Documentation/references.txt 
b/tools/memory-model/Documentation/references.txt
index 74f448f2616a..b177f3e4a614 100644
--- a/tools/memory-model/Documentation/references.txt
+++ b/tools/memory-model/Documentation/references.txt
@@ -63,6 +63,12 @@ o Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan 
Nienhuis,
Principles of Programming Languages (POPL 2017). ACM, New York,
NY, USA, 429–442.
 
+o  Christopher Pulte, Shaked Flur, Will Deacon, Jon French,
+   Susmit Sarkar, and Peter Sewell. 2018. "Simplifying ARM concurrency:
+   multicopy-atomic axiomatic and operational models for ARMv8". In
+   Proceedings of the ACM on Programming Languages, Volume 2, Issue
+   POPL, Article No. 19. ACM, New York, NY, USA.
+
 
 Linux-kernel memory model
 =


[tip:locking/core] MAINTAINERS, tools/memory-model: Update e-mail address for Andrea Parri

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  5ccdb7536ebec7a5f8a3883ba1985a80cec80dd3
Gitweb: https://git.kernel.org/tip/5ccdb7536ebec7a5f8a3883ba1985a80cec80dd3
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:55 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:18 +0200

MAINTAINERS, tools/memory-model: Update e-mail address for Andrea Parri

I moved to Amarula Solutions; switch to work e-mail address.

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Cc: Akira Yokosawa 
Cc: Alan Stern 
Cc: Andrew Morton 
Cc: Boqun Feng 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Linus Torvalds 
Cc: Luc Maranget 
Cc: Nicholas Piggin 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-a...@vger.kernel.org
Cc: parri.and...@gmail.com
Link: 
http://lkml.kernel.org/r/1526340837-1-17-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 649e782e4415..b6341e8a3587 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8203,7 +8203,7 @@ F:drivers/misc/lkdtm/*
 
 LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
 M: Alan Stern 
-M: Andrea Parri 
+M: Andrea Parri 
 M: Will Deacon 
 M: Peter Zijlstra 
 M: Boqun Feng 


[tip:locking/core] tools/memory-model: Fix coding style in 'lock.cat'

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  05604e7e3adbd78f074b7f86b14f50888bf66252
Gitweb: https://git.kernel.org/tip/05604e7e3adbd78f074b7f86b14f50888bf66252
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:54 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:18 +0200

tools/memory-model: Fix coding style in 'lock.cat'

This commit uses tabs for indentation and adds spaces around binary
operators.

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Cc: st...@rowland.harvard.edu
Link: 
http://lkml.kernel.org/r/1526340837-1-16-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/lock.cat | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/memory-model/lock.cat b/tools/memory-model/lock.cat
index cd002a33ca8a..305ded17e741 100644
--- a/tools/memory-model/lock.cat
+++ b/tools/memory-model/lock.cat
@@ -84,16 +84,16 @@ let rfi-lf = ([LKW] ; po-loc ; [LF]) \ ([LKW] ; po-loc ; 
[UL] ; po-loc)
 
 (* rfe for LF events *)
 let all-possible-rfe-lf =
-  (*
-   * Given an LF event r, compute the possible rfe edges for that event
-   * (all those starting from LKW events in other threads),
-   * and then convert that relation to a set of single-edge relations.
-   *)
-  let possible-rfe-lf r =
-let pair-to-relation p = p ++ 0
-in map pair-to-relation ((LKW * {r}) & loc & ext)
-  (* Do this for each LF event r that isn't in rfi-lf *)
-  in map possible-rfe-lf (LF \ range(rfi-lf))
+   (*
+* Given an LF event r, compute the possible rfe edges for that event
+* (all those starting from LKW events in other threads),
+* and then convert that relation to a set of single-edge relations.
+*)
+   let possible-rfe-lf r =
+   let pair-to-relation p = p ++ 0
+   in map pair-to-relation ((LKW * {r}) & loc & ext)
+   (* Do this for each LF event r that isn't in rfi-lf *)
+   in map possible-rfe-lf (LF \ range(rfi-lf))
 
 (* Generate all rf relations for LF events *)
 with rfe-lf from cross(all-possible-rfe-lf)
@@ -110,10 +110,10 @@ let rfi-ru = ([UL] ; po-loc ; [RU]) \ ([UL] ; po-loc ; 
[LKW] ; po-loc)
 
 (* rfe for RU events: an RU may read from an external UL or the initial write 
*)
 let all-possible-rfe-ru =
-   let possible-rfe-ru r =
- let pair-to-relation p = p ++ 0
- in map pair-to-relation (((UL|IW) * {r}) & loc & ext)
-  in map possible-rfe-ru RU
+   let possible-rfe-ru r =
+   let pair-to-relation p = p ++ 0
+   in map pair-to-relation (((UL | IW) * {r}) & loc & ext)
+   in map possible-rfe-ru RU
 
 (* Generate all rf relations for RU events *)
 with rfe-ru from cross(all-possible-rfe-ru)


[tip:locking/core] tools/memory-model: Model 'smp_store_mb()'

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  bf8c6d963d16d40fbe70e94b61d9bf18c455fc6b
Gitweb: https://git.kernel.org/tip/bf8c6d963d16d40fbe70e94b61d9bf18c455fc6b
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:45 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:16 +0200

tools/memory-model: Model 'smp_store_mb()'

This commit models 'smp_store_mb(x, val);' to be semantically equivalent
to 'WRITE_ONCE(x, val); smp_mb();'.

Suggested-by: Paolo Bonzini 
Suggested-by: Peter Zijlstra 
Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Acked-by: Alan Stern 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Link: 
http://lkml.kernel.org/r/1526340837-1-7-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/linux-kernel.def | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/memory-model/linux-kernel.def 
b/tools/memory-model/linux-kernel.def
index 397e4e67e8c8..acf86f6f360a 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -14,6 +14,7 @@ smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
 rcu_dereference(X) __load{once}(X)
+smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 
 // Fences
 smp_mb() { __fence{mb} ; }
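
With this definition, for instance, herd7 should forbid the classic
store-buffering outcome below (an untested sketch, not from the patch):

C SB+smp_store_mb

{}

P0(int *x, int *y)
{
	int r0;

	smp_store_mb(*x, 1);
	r0 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r1;

	smp_store_mb(*y, 1);
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)

Each smp_store_mb() expands to the store plus a full barrier, so the two
reads cannot both miss the other CPU's store.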


[tip:locking/core] tools/memory-model: Fix coding style in 'linux-kernel.def'

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  d17013e0bac66bb4d1be44f061754c7e53292b64
Gitweb: https://git.kernel.org/tip/d17013e0bac66bb4d1be44f061754c7e53292b64
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:33:46 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:17 +0200

tools/memory-model: Fix coding style in 'linux-kernel.def'

This commit fixes white spaces around semicolons.

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Cc: st...@rowland.harvard.edu
Link: 
http://lkml.kernel.org/r/1526340837-1-8-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/linux-kernel.def | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/memory-model/linux-kernel.def 
b/tools/memory-model/linux-kernel.def
index acf86f6f360a..6bd3bc431b3d 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -17,12 +17,12 @@ rcu_dereference(X) __load{once}(X)
 smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 
 // Fences
-smp_mb() { __fence{mb} ; }
-smp_rmb() { __fence{rmb} ; }
-smp_wmb() { __fence{wmb} ; }
-smp_mb__before_atomic() { __fence{before-atomic} ; }
-smp_mb__after_atomic() { __fence{after-atomic} ; }
-smp_mb__after_spinlock() { __fence{after-spinlock} ; }
+smp_mb() { __fence{mb}; }
+smp_rmb() { __fence{rmb}; }
+smp_wmb() { __fence{wmb}; }
+smp_mb__before_atomic() { __fence{before-atomic}; }
+smp_mb__after_atomic() { __fence{after-atomic}; }
+smp_mb__after_spinlock() { __fence{after-spinlock}; }
 
 // Exchange
 xchg(X,V)  __xchg{mb}(X,V)
@@ -35,26 +35,26 @@ cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
 cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
 
 // Spinlocks
-spin_lock(X) { __lock(X) ; }
-spin_unlock(X) { __unlock(X) ; }
+spin_lock(X) { __lock(X); }
+spin_unlock(X) { __unlock(X); }
 spin_trylock(X) __trylock(X)
 
 // RCU
 rcu_read_lock() { __fence{rcu-lock}; }
-rcu_read_unlock() { __fence{rcu-unlock};}
+rcu_read_unlock() { __fence{rcu-unlock}; }
 synchronize_rcu() { __fence{sync-rcu}; }
 synchronize_rcu_expedited() { __fence{sync-rcu}; }
 
 // Atomic
 atomic_read(X) READ_ONCE(*X)
-atomic_set(X,V) { WRITE_ONCE(*X,V) ; }
+atomic_set(X,V) { WRITE_ONCE(*X,V); }
 atomic_read_acquire(X) smp_load_acquire(X)
 atomic_set_release(X,V) { smp_store_release(X,V); }
 
-atomic_add(V,X) { __atomic_op(X,+,V) ; }
-atomic_sub(V,X) { __atomic_op(X,-,V) ; }
-atomic_inc(X)   { __atomic_op(X,+,1) ; }
-atomic_dec(X)   { __atomic_op(X,-,1) ; }
+atomic_add(V,X) { __atomic_op(X,+,V); }
+atomic_sub(V,X) { __atomic_op(X,-,V); }
+atomic_inc(X)   { __atomic_op(X,+,1); }
+atomic_dec(X)   { __atomic_op(X,-,1); }
 
 atomic_add_return(V,X) __atomic_op_return{mb}(X,+,V)
 atomic_add_return_relaxed(V,X) __atomic_op_return{once}(X,+,V)


[tip:locking/core] locking/spinlocks: Clean up comment and #ifndef for {,queued_}spin_is_locked()

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  1362ae43c503a4e333ab6948fc4c6e0e794e1558
Gitweb: https://git.kernel.org/tip/1362ae43c503a4e333ab6948fc4c6e0e794e1558
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:01:29 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:15 +0200

locking/spinlocks: Clean up comment and #ifndef for {,queued_}spin_is_locked()

Removes "#ifndef queued_spin_is_locked" from the generic code: this is
unused and it's reasonable to conclude that it will continue to be unused.

Also removes the comment about spin_is_locked() from mutex_is_locked():
the comment remains valid but is not particularly useful.

Suggested-by: Will Deacon 
Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Acked-by: Will Deacon 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Cc: st...@rowland.harvard.edu
Link: 
http://lkml.kernel.org/r/1526338889-7003-3-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 include/asm-generic/qspinlock.h | 2 --
 include/linux/mutex.h   | 3 ---
 2 files changed, 5 deletions(-)

diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index a8ed0a352d75..9cc457597ddf 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -26,7 +26,6 @@
  * @lock: Pointer to queued spinlock structure
  * Return: 1 if it is locked, 0 otherwise
  */
-#ifndef queued_spin_is_locked
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
/*
@@ -35,7 +34,6 @@ static __always_inline int queued_spin_is_locked(struct 
qspinlock *lock)
 */
return atomic_read(&lock->val);
 }
-#endif
 
 /**
  * queued_spin_value_unlocked - is the spinlock structure unlocked?
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 14bc0d5d0ee5..3093dd162424 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -146,9 +146,6 @@ extern void __mutex_init(struct mutex *lock, const char 
*name,
  */
 static inline bool mutex_is_locked(struct mutex *lock)
 {
-   /*
-* XXX think about spin_is_locked
-*/
return __mutex_owner(lock) != NULL;
 }
 


[tip:locking/core] locking/spinlocks/arm64: Remove smp_mb() from arch_spin_is_locked()

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  c6f5d02b6a0fb91be5d656885ce02cf28952181d
Gitweb: https://git.kernel.org/tip/c6f5d02b6a0fb91be5d656885ce02cf28952181d
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:01:28 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:15 +0200

locking/spinlocks/arm64: Remove smp_mb() from arch_spin_is_locked()

The following commit:

  38b850a73034f ("arm64: spinlock: order spin_{is_locked,unlock_wait} against 
local locks")

... added an smp_mb() to arch_spin_is_locked(), in order
"to ensure that the lock value is always loaded after any other locks have
been taken by the current CPU", and reported one example (the "insane case"
in ipc/sem.c) relying on such a guarantee.

It is however understood that spin_is_locked() is not required to provide
such an ordering guarantee (a guarantee that is currently not provided by
all the implementations/archs), and that callers relying on such ordering
should instead insert suitable memory barriers before acting on the result
of spin_is_locked().

Following a recent auditing [1] of the callers of {,raw_}spin_is_locked(),
revealing that none of them are relying on the ordering guarantee anymore,
this commit removes the leading smp_mb() from the primitive thus reverting
38b850a73034f.

[1] https://marc.info/?l=linux-kernel&m=151981440005264&w=2
https://marc.info/?l=linux-kernel&m=152042843808540&w=2
https://marc.info/?l=linux-kernel&m=152043346110262&w=2

Signed-off-by: Andrea Parri 
Signed-off-by: Paul E. McKenney 
Acked-by: Will Deacon 
Cc: Andrew Morton 
Cc: Catalin Marinas 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Cc: st...@rowland.harvard.edu
Link: 
http://lkml.kernel.org/r/1526338889-7003-2-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 arch/arm64/include/asm/spinlock.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/arm64/include/asm/spinlock.h 
b/arch/arm64/include/asm/spinlock.h
index ebdae15d665d..26c5bd7d88d8 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -122,11 +122,6 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t 
lock)
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-   /*
-* Ensure prior spin_lock operations to other locks have completed
-* on this CPU before we test whether "lock" is locked.
-*/
-   smp_mb(); /* ^^^ */
return !arch_spin_value_unlocked(READ_ONCE(*lock));
 }
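
A sketch of the barrier pattern such callers would now use (hypothetical
caller; my_lock/other_lock are made-up names, not from the patch):

	spin_lock(&my_lock);
	/*
	 * Order the above lock acquisition against the spin_is_locked()
	 * load below: spin_is_locked() itself provides no ordering.
	 */
	smp_mb();
	if (spin_is_locked(&other_lock)) {
		/* ... back off: the other path holds other_lock ... */
	}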
 


[tip:locking/core] locking/spinlocks: Document the semantics of spin_is_locked()

2018-05-14 Thread tip-bot for Andrea Parri
Commit-ID:  b7e4aadef28f217de8907eec60a964328797a2be
Gitweb: https://git.kernel.org/tip/b7e4aadef28f217de8907eec60a964328797a2be
Author: Andrea Parri 
AuthorDate: Mon, 14 May 2018 16:01:27 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 15 May 2018 08:11:15 +0200

locking/spinlocks: Document the semantics of spin_is_locked()

There appeared to be a certain, recurrent uncertainty concerning the
semantics of spin_is_locked(), likely a consequence of the fact that
this semantics remains undocumented or that it has been historically
linked to the (likewise unclear) semantics of spin_unlock_wait().

A recent auditing [1] of the callers of the primitive confirmed that
none of them are relying on particular ordering guarantees; document
this semantics by adding a docbook header to spin_is_locked(). Also,
describe behaviors specific to certain CONFIG_SMP=n builds.

[1] https://marc.info/?l=linux-kernel&m=151981440005264&w=2
https://marc.info/?l=linux-kernel&m=152042843808540&w=2
https://marc.info/?l=linux-kernel&m=152043346110262&w=2

Co-Developed-by: Andrea Parri 
Co-Developed-by: Alan Stern 
Co-Developed-by: David Howells 
Signed-off-by: Andrea Parri 
Signed-off-by: Alan Stern 
Signed-off-by: David Howells 
Signed-off-by: Paul E. McKenney 
Acked-by: Randy Dunlap 
Cc: Akira Yokosawa 
Cc: Andrew Morton 
Cc: Boqun Feng 
Cc: Jade Alglave 
Cc: Linus Torvalds 
Cc: Luc Maranget 
Cc: Nicholas Piggin 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-a...@vger.kernel.org
Cc: parri.and...@gmail.com
Link: 
http://lkml.kernel.org/r/1526338889-7003-1-git-send-email-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 include/linux/spinlock.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 4894d322d258..1e8a46435838 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -380,6 +380,24 @@ static __always_inline int spin_trylock_irq(spinlock_t 
*lock)
raw_spin_trylock_irqsave(spinlock_check(lock), flags); \
 })
 
+/**
+ * spin_is_locked() - Check whether a spinlock is locked.
+ * @lock: Pointer to the spinlock.
+ *
+ * This function is NOT required to provide any memory ordering
+ * guarantees; it could be used for debugging purposes or, when
+ * additional synchronization is needed, accompanied with other
+ * constructs (memory barriers) enforcing the synchronization.
+ *
+ * Returns: 1 if @lock is locked, 0 otherwise.
+ *
+ * Note that the function only tells you that the spinlock is
+ * seen to be locked, not that it is locked on your CPU.
+ *
+ * Further, on CONFIG_SMP=n builds with CONFIG_DEBUG_SPINLOCK=n,
+ * the return value is always 0 (see include/linux/spinlock_up.h).
+ * Therefore you should not rely heavily on the return value.
+ */
 static __always_inline int spin_is_locked(spinlock_t *lock)
 {
return raw_spin_is_locked(&lock->rlock);
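
Given these caveats, usage is best confined to debug checks along the
lines of (a hypothetical example; foo_lock is a made-up name):

	/*
	 * Debugging only: no ordering is assumed, and the check is
	 * guarded so that the unconditional 0 returned on UP builds
	 * without CONFIG_DEBUG_SPINLOCK does not trip the warning.
	 */
	if (IS_ENABLED(CONFIG_SMP) || IS_ENABLED(CONFIG_DEBUG_SPINLOCK))
		WARN_ON_ONCE(!spin_is_locked(&foo_lock));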


Re: [PATCH 13/18] wait: wait.h: Get rid of a kernel-doc/Sphinx warnings

2018-05-10 Thread Andrea Parri
On Thu, May 10, 2018 at 07:15:59AM -0600, Jonathan Corbet wrote:
> On Thu, 10 May 2018 14:23:35 +0200
> Andrea Parri  wrote:
> 
> > only
> > remember that other people (including some developers running into the
> > "disadventure" of opening an RST doc. from their preferred text editor
> > and being brought to conclude:  "WTH!  I need to open a web browser, I
> > guess...") _use_ such doc. and _do care_ about it, and that what might
> > be an improvement for some people might look like "vandalizing" to others.
> 
> If you have an example of a place where use of a web browser has been
> made mandatory, please point it out.  Avoiding that was at the top of the
> list of explicit requirements.

That's all I need.


> Surely an extra colon is not going to
> force you to run screaming to the protective embrace of Firefox...?

Let me put it in these terms: I believe that that extra colon (or the
"diagram" keywork) is not going to improve/help my use of the doc. ;D

  Andrea


> 
> Thanks,
> 
> jon


Re: [PATCH 13/18] wait: wait.h: Get rid of a kernel-doc/Sphinx warnings

2018-05-10 Thread Andrea Parri
On Wed, May 09, 2018 at 08:45:18AM -0600, Jonathan Corbet wrote:
> On Wed, 9 May 2018 10:41:20 +0200
> Peter Zijlstra  wrote:
> 
> > > This is easily done by using "::" instead of just ":".  
> > 
> > And I'll voice my objection once again. This makes a regular comment
> > worse. This rst stuff is utter shit for making normal text files less
> > readable in your favourite text editor.
> > 
> > If this gets merged, I'll simply remove that spurious ':' the next time
> > I'm near that comment.
> 
> Seriously, Peter?
> 
> It's a simple colon.  It goes along with the /** marker for kerneldoc
> comments and the @ markers found within them, both of which you seem to
> have found a way to live with.
> 
> The RST work was discussed for a year before we even started.  It has
> brought in the efforts of a large number of developers, all of whom see
> the value in actually caring about our documentation and making it
> accessible to a much larger group of readers.  And it has all happened
> while preserving the primacy of the plain-text documentation.
> 
> You're not the only consumer of the docs.  You may not appreciate the
> improvements that have come, but others certainly do.  I do hope that you
> can find it in youself to avoid vandalizing things for everybody else ...?

You wrote it:  the fact that some people (including its developers) see
a value in the RST work or the fact that such work made the kernel doc.
accessible to a larger group of readers are not in question here;  only
remember that other people (including some developers running into the
"disadventure" of opening an RST doc. from their preferred text editor
and being brought to conclude:  "WTH!  I need to open a web browser, I
guess...") _use_ such doc. and _do care_ about it, and that what might
be an improvement for some people might look like "vandalizing" to others.

We're talking about readability/accessibility here, but I think similar
considerations apply to other aspects of the doc. such as availability/
completeness (yes, I did hear developers arguing "I won't write such a
doc., because...") and consistency (w.r.t. the doc. itself and sources).

  Andrea


> 
> Thanks,
> 
> jon


Re: linux-next: manual merge of the akpm-current tree with the jc_docs tree

2018-05-09 Thread Andrea Parri
On Wed, May 09, 2018 at 11:11:36AM -0600, Jonathan Corbet wrote:
> On Wed, 9 May 2018 18:53:28 +0200
> Andrea Parri  wrote:
> 
> > > Now that I look a little closer, I think the real issue is that the
> > > "features" documentation assumes that there's a Kconfig option for each,
> > > but there isn't in this case.  The lack of a Kconfig option does not,
> > > this time around, imply that the feature has gone away.
> > > 
> > > I think that I should probably revert this patch in the short term.
> > > Longer-term, it would be good to have an alternative syntax for "variable
> > > set in the arch headers" to describe situations like this.  
> > 
> > Both matters were discussed during v1:
> > 
> >   
> > http://lkml.kernel.org/r/1522774551-9503-1-git-send-email-andrea.pa...@amarulasolutions.com
> > 
> > ... (and the gory details are documented in features-refresh.sh ;-) ).
> 
> So I'll admit to being confused, since I don't see discussion of the
> actual topic at hand.

A couple of clicks on "next in thread"  :-)

  https://marc.info/?l=linux-kernel&m=152284705204400&w=2
  https://marc.info/?l=linux-kernel&m=152294150600751&w=2


> 
> > As I suggested above, simply reverting this patch will leave this file,
> > (and only this file!) out-of-date (and won't resolve the conflict with
> > Laurent's patch ...).
> 
> Reverting this patch retains the updates from earlier in the series, and
> does indeed make the conflict go away, so I'm still confused.  What am I
> missing?

The updates from earlier added "TODO" rows for nds32 and riscv, but missed
the "TODO -> ok" update for riscv.

  Andrea


> 
> Thanks,
> 
> jon


Re: linux-next: manual merge of the akpm-current tree with the jc_docs tree

2018-05-09 Thread Andrea Parri
On Wed, May 09, 2018 at 08:59:20AM -0600, Jonathan Corbet wrote:
> On Wed, 9 May 2018 15:28:24 +0200
> Andrea Parri  wrote:
> 
> > > BTW, it would be nice if the the question "Why was this file removed?" was
> > > answered by that jc_docs commit message ...  I actually wonder if this
> > > file needs to return (I have no way of knowing).  
> > 
> > My bad; thanks for pointing this out.
> > 
> > Mmh... "why" would have been something like "the feature has no Kconfig". 
> > ;-)
> > 
> > I defer to your (community) decision regarding "if this file needs to 
> > return"
> > (Cc-ing Ingo, who created the file and also suggested its removal); I remain
> > available for preparing the patch to restore (and refresh) this file, should
> > you agree with this approach.
> 
> So I'll confess that I balked on the lack of a changelog, but then decided
> to proceed with the patch (and the other removal as well) due to the lack
> of the Kconfig option.
> 
> Now that I look a little closer, I think the real issue is that the
> "features" documentation assumes that there's a Kconfig option for each,
> but there isn't in this case.  The lack of a Kconfig option does not,
> this time around, imply that the feature has gone away.
> 
> I think that I should probably revert this patch in the short term.
> Longer-term, it would be good to have an alternative syntax for "variable
> set in the arch headers" to describe situations like this.

Both matters were discussed during v1:

  
http://lkml.kernel.org/r/1522774551-9503-1-git-send-email-andrea.pa...@amarulasolutions.com

... (and the gory details are documented in features-refresh.sh ;-) ).

As I suggested above, simply reverting this patch will leave this file,
(and only this file!) out-of-date (and won't resolve the conflict with
Laurent's patch ...).

  Andrea


> 
> Make sense?
> 
> Thanks,
> 
> jon


Re: [PATCH v2 08/11] docs: refcount-vs-atomic.rst: prefix url with https

2018-05-09 Thread Andrea Parri
Hi Mauro,

On Wed, May 09, 2018 at 10:18:51AM -0300, Mauro Carvalho Chehab wrote:
> There's a github URL there, but it is not prefixed by https.
> Add a prefix, to avoid false-positives with:
>   ./scripts/documentation-file-ref-check
> 
> As a side effect, Sphinx should also generate a cross-ref.
> 
> Signed-off-by: Mauro Carvalho Chehab 

There seems to be a "conflicting" patch ("applied" according to Jon):


http://lkml.kernel.org/r/1525468309-5310-1-git-send-email-andrea.pa...@amarulasolutions.com

Let me stress here that the github repo. is out-of-date (and we have
no plans to keep that in sync with mainline).

  Andrea


> ---
>  Documentation/core-api/refcount-vs-atomic.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> b/Documentation/core-api/refcount-vs-atomic.rst
> index 83351c258cdb..185d659e350a 100644
> --- a/Documentation/core-api/refcount-vs-atomic.rst
> +++ b/Documentation/core-api/refcount-vs-atomic.rst
> @@ -17,7 +17,7 @@ in order to help maintainers validate their code against 
> the change in
>  these memory ordering guarantees.
>  
>  The terms used through this document try to follow the formal LKMM defined in
> -github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
> +https://github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
>  
>  memory-barriers.txt and atomic_t.txt provide more background to the
>  memory ordering in general and for atomic operations specifically.
> -- 
> 2.17.0
> 


Re: linux-next: manual merge of the akpm-current tree with the jc_docs tree

2018-05-09 Thread Andrea Parri
Really Cc-ing Ingo:

On Wed, May 09, 2018 at 03:28:24PM +0200, Andrea Parri wrote:
> On Wed, May 09, 2018 at 08:25:26PM +1000, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Today's linux-next merge of the akpm-current tree got a conflict in:
> > 
> >   Documentation/features/vm/pte_special/arch-support.txt
> > 
> > between commit:
> > 
> >   2bef69a385b4 ("Documentation/features/vm: Remove arch support status file 
> > for 'pte_special'")
> > 
> > from the jc_docs tree and commit:
> > 
> >   1099dc900e93 ("mm: introduce ARCH_HAS_PTE_SPECIAL")
> > 
> > from the akpm-current tree.
> > 
> > I fixed it up (the former removed the file, so I did that) and can
> > carry the fix as necessary. This is now fixed as far as linux-next is
> > concerned, but any non trivial conflicts should be mentioned to your
> > upstream maintainer when your tree is submitted for merging.  You may
> > also want to consider cooperating with the maintainer of the conflicting
> > tree to minimise any particularly complex conflicts.
> > 
> > BTW, it would be nice if the the question "Why was this file removed?" was
> > answered by that jc_docs commit message ...  I actually wonder if this
> > file needs to return (I have no way of knowing).
> 
> My bad; thanks for pointing this out.
> 
> Mmh... "why" would have been something like "the feature has no Kconfig". ;-)
> 
> I defer to your (community) decision regarding "if this file needs to return"
> (Cc-ing Ingo, who created the file and also suggested its removal); I remain
> available for preparing the patch to restore (and refresh) this file, should
> you agree with this approach.
> 
>   Andrea
> 
> 
> > 
> > -- 
> > Cheers,
> > Stephen Rothwell
> 
> 


Re: linux-next: manual merge of the akpm-current tree with the jc_docs tree

2018-05-09 Thread Andrea Parri
On Wed, May 09, 2018 at 08:25:26PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the akpm-current tree got a conflict in:
> 
>   Documentation/features/vm/pte_special/arch-support.txt
> 
> between commit:
> 
>   2bef69a385b4 ("Documentation/features/vm: Remove arch support status file 
> for 'pte_special'")
> 
> from the jc_docs tree and commit:
> 
>   1099dc900e93 ("mm: introduce ARCH_HAS_PTE_SPECIAL")
> 
> from the akpm-current tree.
> 
> I fixed it up (the former removed the file, so I did that) and can
> carry the fix as necessary. This is now fixed as far as linux-next is
> concerned, but any non trivial conflicts should be mentioned to your
> upstream maintainer when your tree is submitted for merging.  You may
> also want to consider cooperating with the maintainer of the conflicting
> tree to minimise any particularly complex conflicts.
> 
> BTW, it would be nice if the the question "Why was this file removed?" was
> answered by that jc_docs commit message ...  I actually wonder if this
> file needs to return (I have no way of knowing).

My bad; thanks for pointing this out.

Mmh... "why" would have been something like "the feature has no Kconfig". ;-)

I defer to your (community) decision regarding "if this file needs to return"
(Cc-ing Ingo, who created the file and also suggested its removal); I remain
available for preparing the patch to restore (and refresh) this file, should
you agree with this approach.

  Andrea


> 
> -- 
> Cheers,
> Stephen Rothwell




Re: [PATCH 05/18] docs: core-api: add cachetlb documentation

2018-05-08 Thread Andrea Parri
On Tue, May 08, 2018 at 03:28:51PM -0300, Mauro Carvalho Chehab wrote:
> Em Tue, 8 May 2018 15:05:07 -0300
> Mauro Carvalho Chehab  escreveu:
> 
> > Em Tue, 08 May 2018 17:40:56 +0300
> > Jani Nikula  escreveu:

[...]

> > > Side note, there's scripts/documentation-file-ref-check to grep the
> > > kernel tree for things that look like file references to Documentation/*
> > > and complain if they don't exist.
> > > 
> > > I get about 350+ hits with that, patches welcome! ;)  
> > 
> > This small script fixes a bunch of such errors:
> > 
> > scripts/documentation-file-ref-check 2>broken_refs
> > for i in $(cat broken_refs|cut -d: -f 2|grep -v devicetree|sort|uniq|grep \\.txt); do
> > 	rst=$(basename $i)
> > 	rst=${rst/.txt/.rst}
> > 	f=$(find . -name $rst)
> > 	f=${f#./}
> > 	if [ "$f" != "" ]; then
> > 		echo "Replacing $i to $f"
> > 		for j in $(git grep -l $i); do
> > 			sed "s@$i@$f@g" -i $j
> > 		done
> > 	fi
> > done
> 
> It follows an improvement to the above script that shows also what
> it didn't find as a ReST file, and the ones that have common names
> with multiple matches.
> 
> I guess we could integrate something like that at 
> scripts/documentation-file-ref-check, in order to allow auto-correcting
> renamed .txt files.

FWIW, this would be more than welcome; thank you,

  Andrea


> 
> Regards,
> Mauro
> 
> 
> #!/bin/bash
> 
> scripts/documentation-file-ref-check 2>broken_refs
> for i in $(cat broken_refs|cut -d: -f 2|grep -v devicetree|sort|uniq|grep \\.txt); do
> 	rst=$(basename $i)
> 	rst=${rst/.txt/.rst}
> 	f=$(find . -name $rst)
> 
> 	if [ "$f" == "" ]; then
> 		echo "ERROR: Didn't find a .rst replacement for $i"
> 	elif [ "$(echo $f | grep ' ')" != "" ]; then
> 		echo "ERROR: Found multiple possible replacements for $i:"
> 		for j in $f; do
> 			echo "$j"
> 		done
> 	else
> 		echo "Replacing $i to $f"
> 		f=${f#./}
> 		for j in $(git grep -l $i); do
> 			sed "s@$i@$f@g" -i $j
> 		done
> 	fi
> done
> 
> 
> Thanks,
> Mauro


Re: [PATCH 05/18] docs: core-api: add cachetlb documentation

2018-05-08 Thread Andrea Parri
On Tue, May 08, 2018 at 10:04:08AM -0600, Jonathan Corbet wrote:
> On Mon,  7 May 2018 06:35:41 -0300
> Mauro Carvalho Chehab  wrote:
> 
> > The cachetlb.txt is already in ReST format. So, move it to the
> > core-api guide, where it belongs.
> > 
> > Signed-off-by: Mauro Carvalho Chehab 
> 
> I think we could do a better job of this by integrating it with the
> kerneldoc comments.  Meanwhile, though, this is a step in the right
> direction, so I've applied it, thanks.

This depends on what you mean by "the right direction": IMO, breaking
in-source references and get_maintainer.pl does not qualify as such.

  Andrea


> 
> jon


Re: [PATCH 05/18] docs: core-api: add cachetlb documentation

2018-05-08 Thread Andrea Parri
On Tue, May 08, 2018 at 06:02:42PM +0200, Andrea Parri wrote:
> Hi Jani,
> 
> On Tue, May 08, 2018 at 05:40:56PM +0300, Jani Nikula wrote:
> > On Mon, 07 May 2018, Andrea Parri  wrote:
> > > On Mon, May 07, 2018 at 06:35:41AM -0300, Mauro Carvalho Chehab wrote:
> > >> The cachetlb.txt is already in ReST format. So, move it to the
> > >> core-api guide, where it belongs.
> > >> 
> > >> Signed-off-by: Mauro Carvalho Chehab 
> > >> ---
> > >>  Documentation/00-INDEX| 2 --
> > >>  Documentation/{cachetlb.txt => core-api/cachetlb.rst} | 0
> > >>  Documentation/core-api/index.rst  | 1 +
> > >>  Documentation/memory-barriers.txt | 2 +-
> > >>  Documentation/translations/ko_KR/memory-barriers.txt  | 2 +-
> > >>  5 files changed, 3 insertions(+), 4 deletions(-)
> > >>  rename Documentation/{cachetlb.txt => core-api/cachetlb.rst} (100%)
> > >
> > > I see a few "inline" references to the .txt file in -rc4 (see below):
> > > I am not sure if you managed to update them too.
> > 
> > Side note, there's scripts/documentation-file-ref-check to grep the
> > kernel tree for things that look like file references to Documentation/*
> > and complain if they don't exist.
> > 
> > I get about 350+ hits with that, patches welcome! ;)
> 
> Thanks for pointing out the script/results.
> 
> It's also worth stressing, I think, the fact that some of those are from
the MAINTAINERS file; I stumbled across one of them yesterday:
> 
>   
> http://lkml.kernel.org/r/1525707655-3542-1-git-send-email-andrea.pa...@amarulasolutions.com
> 
False positives aside (e.g., the four references in tools/memory-model/),
those are regressions from my POV: please do not (consciously) merge more!

s/four/five

  Andrea


> 
>   Andrea
> 
> 
> > 
> > 
> > BR,
> > Jani.
> > 
> > 
> > >
> > > ./arch/microblaze/include/asm/cacheflush.h:/* Look at 
> > > Documentation/cachetlb.txt */
> > > ./arch/unicore32/include/asm/cacheflush.h: *  See 
> > > Documentation/cachetlb.txt for more information.
> > > ./arch/arm64/include/asm/cacheflush.h: *  See Documentation/cachetlb.txt 
> > > for more information. Please note that
> > > ./arch/arm/include/asm/cacheflush.h: *See Documentation/cachetlb.txt 
> > > for more information.
> > > ./arch/xtensa/include/asm/cacheflush.h: * (see also 
> > > Documentation/cachetlb.txt)
> > > ./arch/xtensa/include/asm/cacheflush.h:/* This is not required, see 
> > > Documentation/cachetlb.txt */
> > >
> > >   Andrea
> > >
> > >
> > >> 
> > >> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > >> index 53699c79ee54..04074059bcdc 100644
> > >> --- a/Documentation/00-INDEX
> > >> +++ b/Documentation/00-INDEX
> > >> @@ -76,8 +76,6 @@ bus-devices/
> > >>  - directory with info on TI GPMC (General Purpose Memory 
> > >> Controller)
> > >>  bus-virt-phys-mapping.txt
> > >>  - how to access I/O mapped memory from within device drivers.
> > >> -cachetlb.txt
> > >> -- describes the cache/TLB flushing interfaces Linux uses.
> > >>  cdrom/
> > >>  - directory with information on the CD-ROM drivers that Linux 
> > >> has.
> > >>  cgroup-v1/
> > >> diff --git a/Documentation/cachetlb.txt 
> > >> b/Documentation/core-api/cachetlb.rst
> > >> similarity index 100%
> > >> rename from Documentation/cachetlb.txt
> > >> rename to Documentation/core-api/cachetlb.rst
> > >> diff --git a/Documentation/core-api/index.rst 
> > >> b/Documentation/core-api/index.rst
> > >> index c670a8031786..d4d71ee564ae 100644
> > >> --- a/Documentation/core-api/index.rst
> > >> +++ b/Documentation/core-api/index.rst
> > >> @@ -14,6 +14,7 @@ Core utilities
> > >> kernel-api
> > >> assoc_array
> > >> atomic_ops
> > >> +   cachetlb
> > >> refcount-vs-atomic
> > >> cpu_hotplug
> > >> idr
> > >> diff --git a/Documentation/memory-barriers.txt 
> > >> b/Documentation/memory-barriers.txt
> > >> index 6dafc8085acc..983249906fc6 100644
> > >> --- a/Documentation/memory-barriers.txt
> > >>

Re: [PATCH 05/18] docs: core-api: add cachetlb documentation

2018-05-08 Thread Andrea Parri
Hi Jani,

On Tue, May 08, 2018 at 05:40:56PM +0300, Jani Nikula wrote:
> On Mon, 07 May 2018, Andrea Parri  wrote:
> > On Mon, May 07, 2018 at 06:35:41AM -0300, Mauro Carvalho Chehab wrote:
> >> The cachetlb.txt is already in ReST format. So, move it to the
> >> core-api guide, where it belongs.
> >> 
> >> Signed-off-by: Mauro Carvalho Chehab 
> >> ---
> >>  Documentation/00-INDEX| 2 --
> >>  Documentation/{cachetlb.txt => core-api/cachetlb.rst} | 0
> >>  Documentation/core-api/index.rst  | 1 +
> >>  Documentation/memory-barriers.txt | 2 +-
> >>  Documentation/translations/ko_KR/memory-barriers.txt  | 2 +-
> >>  5 files changed, 3 insertions(+), 4 deletions(-)
> >>  rename Documentation/{cachetlb.txt => core-api/cachetlb.rst} (100%)
> >
> > I see a few "inline" references to the .txt file in -rc4 (see below):
> > I am not sure if you managed to update them too.
> 
> Side note, there's scripts/documentation-file-ref-check to grep the
> kernel tree for things that look like file references to Documentation/*
> and complain if they don't exist.
> 
> I get about 350+ hits with that, patches welcome! ;)

Thanks for pointing out the script/results.

It's also worth stressing, I think, the fact that some of those are from
the MAINTAINERS file; I stumbled across one of them yesterday:

  
http://lkml.kernel.org/r/1525707655-3542-1-git-send-email-andrea.pa...@amarulasolutions.com

False positives aside (e.g., the four references in tools/memory-model/),
those are regressions from my POV: please do not (consciously) merge more!

  Andrea


> 
> 
> BR,
> Jani.
> 
> 
> >
> > ./arch/microblaze/include/asm/cacheflush.h:/* Look at 
> > Documentation/cachetlb.txt */
> > ./arch/unicore32/include/asm/cacheflush.h: *See 
> > Documentation/cachetlb.txt for more information.
> > ./arch/arm64/include/asm/cacheflush.h: *See Documentation/cachetlb.txt 
> > for more information. Please note that
> > ./arch/arm/include/asm/cacheflush.h: *  See Documentation/cachetlb.txt 
> > for more information.
> > ./arch/xtensa/include/asm/cacheflush.h: * (see also 
> > Documentation/cachetlb.txt)
> > ./arch/xtensa/include/asm/cacheflush.h:/* This is not required, see 
> > Documentation/cachetlb.txt */
> >
> >   Andrea
> >
> >
> >> 
> >> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> >> index 53699c79ee54..04074059bcdc 100644
> >> --- a/Documentation/00-INDEX
> >> +++ b/Documentation/00-INDEX
> >> @@ -76,8 +76,6 @@ bus-devices/
> >>- directory with info on TI GPMC (General Purpose Memory Controller)
> >>  bus-virt-phys-mapping.txt
> >>- how to access I/O mapped memory from within device drivers.
> >> -cachetlb.txt
> >> -  - describes the cache/TLB flushing interfaces Linux uses.
> >>  cdrom/
> >>- directory with information on the CD-ROM drivers that Linux has.
> >>  cgroup-v1/
> >> diff --git a/Documentation/cachetlb.txt 
> >> b/Documentation/core-api/cachetlb.rst
> >> similarity index 100%
> >> rename from Documentation/cachetlb.txt
> >> rename to Documentation/core-api/cachetlb.rst
> >> diff --git a/Documentation/core-api/index.rst 
> >> b/Documentation/core-api/index.rst
> >> index c670a8031786..d4d71ee564ae 100644
> >> --- a/Documentation/core-api/index.rst
> >> +++ b/Documentation/core-api/index.rst
> >> @@ -14,6 +14,7 @@ Core utilities
> >> kernel-api
> >> assoc_array
> >> atomic_ops
> >> +   cachetlb
> >> refcount-vs-atomic
> >> cpu_hotplug
> >> idr
> >> diff --git a/Documentation/memory-barriers.txt 
> >> b/Documentation/memory-barriers.txt
> >> index 6dafc8085acc..983249906fc6 100644
> >> --- a/Documentation/memory-barriers.txt
> >> +++ b/Documentation/memory-barriers.txt
> >> @@ -2903,7 +2903,7 @@ is discarded from the CPU's cache and reloaded.  To 
> >> deal with this, the
> >>  appropriate part of the kernel must invalidate the overlapping bits of the
> >>  cache on each CPU.
> >>  
> >> -See Documentation/cachetlb.txt for more information on cache management.
> >> +See Documentation/core-api/cachetlb.rst for more information on cache 
> >> management.
> >>  
> >>  
> >>  CACHE COHERENCY VS MMIO
> >> diff --git a/Documentation/translations/ko_KR/mem

[PATCH] certificate handling: Update references to the documentation

2018-05-07 Thread Andrea Parri
Commit 94e980cc45f2b2 ("Documentation/module-signing.txt: convert to
ReST markup") converted the .txt doc. to ReST markup, but it did not
update the references to the doc. (including in MAINTAINERS).

Signed-off-by: Andrea Parri 
---
 MAINTAINERS   | 2 +-
 certs/Kconfig | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index df6e9bb2559af..803d4b4ff5f1d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3347,7 +3347,7 @@ M:David Howells 
 M: David Woodhouse 
 L: keyri...@vger.kernel.org
 S: Maintained
-F: Documentation/module-signing.txt
+F: Documentation/admin-guide/module-signing.rst
 F: certs/
 F: scripts/sign-file.c
 F: scripts/extract-cert.c
diff --git a/certs/Kconfig b/certs/Kconfig
index 5f7663df6e8e3..c94e93d8bccf0 100644
--- a/certs/Kconfig
+++ b/certs/Kconfig
@@ -13,7 +13,7 @@ config MODULE_SIG_KEY
 
  If this option is unchanged from its default "certs/signing_key.pem",
  then the kernel will automatically generate the private key and
- certificate as described in Documentation/module-signing.txt
+ certificate as described in 
Documentation/admin-guide/module-signing.rst
 
 config SYSTEM_TRUSTED_KEYRING
bool "Provide system-wide ring of trusted keys"
-- 
2.7.4



Re: [PATCH 07/18] docs: core-api: add circular-buffers documentation

2018-05-07 Thread Andrea Parri
On Mon, May 07, 2018 at 06:35:43AM -0300, Mauro Carvalho Chehab wrote:
> The circular-buffers.txt is already in ReST format. So, move it to the
> core-api guide, where it belongs.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/00-INDEX  | 2 --
>  .../{circular-buffers.txt => core-api/circular-buffers.rst} | 0
>  Documentation/core-api/index.rst| 1 +
>  Documentation/memory-barriers.txt   | 2 +-
>  Documentation/translations/ko_KR/memory-barriers.txt| 2 +-
>  5 files changed, 3 insertions(+), 4 deletions(-)
>  rename Documentation/{circular-buffers.txt => core-api/circular-buffers.rst} 
> (100%)

Similarly:

./include/linux/circ_buf.h: * See Documentation/circular-buffers.txt for more 
information.
./drivers/lightnvm/pblk-rb.c: * (Documentation/circular-buffers.txt)
./drivers/media/dvb-core/dvb_ringbuffer.c:   * for memory barriers also see 
Documentation/circular-buffers.txt

  Andrea


> 
> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> index c6b81ef9827b..a9dd1384d8e3 100644
> --- a/Documentation/00-INDEX
> +++ b/Documentation/00-INDEX
> @@ -80,8 +80,6 @@ cdrom/
>   - directory with information on the CD-ROM drivers that Linux has.
>  cgroup-v1/
>   - cgroups v1 features, including cpusets and memory controller.
> -circular-buffers.txt
> - - how to make use of the existing circular buffer infrastructure
>  clk.txt
>   - info on the common clock framework
>  cma/
> diff --git a/Documentation/circular-buffers.txt 
> b/Documentation/core-api/circular-buffers.rst
> similarity index 100%
> rename from Documentation/circular-buffers.txt
> rename to Documentation/core-api/circular-buffers.rst
> diff --git a/Documentation/core-api/index.rst 
> b/Documentation/core-api/index.rst
> index d4d71ee564ae..3864de589126 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -26,6 +26,7 @@ Core utilities
> genalloc
> errseq
> printk-formats
> +   circular-buffers
>  
>  Interfaces for kernel debugging
>  ===
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index 983249906fc6..33b8bc9573f8 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -3083,7 +3083,7 @@ CIRCULAR BUFFERS
>  Memory barriers can be used to implement circular buffering without the need
>  of a lock to serialise the producer with the consumer.  See:
>  
> - Documentation/circular-buffers.txt
> + Documentation/core-api/circular-buffers.rst
>  
>  for details.
>  
> diff --git a/Documentation/translations/ko_KR/memory-barriers.txt 
> b/Documentation/translations/ko_KR/memory-barriers.txt
> index 081937577c1a..2ec5fe0c9cf4 100644
> --- a/Documentation/translations/ko_KR/memory-barriers.txt
> +++ b/Documentation/translations/ko_KR/memory-barriers.txt
> @@ -3023,7 +3023,7 @@ smp_mb() 가 아니라 virt_mb() 를 사용해야 합니다.
>  동기화에 락을 사용하지 않고 구현하는데에 사용될 수 있습니다.  더 자세한 내용을
>  위해선 다음을 참고하세요:
>  
> - Documentation/circular-buffers.txt
> + Documentation/core-api/circular-buffers.rst
>  
>  
>  =
> -- 
> 2.17.0
> 


Re: [PATCH 05/18] docs: core-api: add cachetlb documentation

2018-05-07 Thread Andrea Parri
On Mon, May 07, 2018 at 06:35:41AM -0300, Mauro Carvalho Chehab wrote:
> The cachetlb.txt is already in ReST format. So, move it to the
> core-api guide, where it belongs.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/00-INDEX| 2 --
>  Documentation/{cachetlb.txt => core-api/cachetlb.rst} | 0
>  Documentation/core-api/index.rst  | 1 +
>  Documentation/memory-barriers.txt | 2 +-
>  Documentation/translations/ko_KR/memory-barriers.txt  | 2 +-
>  5 files changed, 3 insertions(+), 4 deletions(-)
>  rename Documentation/{cachetlb.txt => core-api/cachetlb.rst} (100%)

I see a few "inline" references to the .txt file in -rc4 (see below):
I am not sure if you managed to update them too.

./arch/microblaze/include/asm/cacheflush.h:/* Look at 
Documentation/cachetlb.txt */
./arch/unicore32/include/asm/cacheflush.h: *See Documentation/cachetlb.txt 
for more information.
./arch/arm64/include/asm/cacheflush.h: *See Documentation/cachetlb.txt 
for more information. Please note that
./arch/arm/include/asm/cacheflush.h: *  See Documentation/cachetlb.txt for more 
information.
./arch/xtensa/include/asm/cacheflush.h: * (see also Documentation/cachetlb.txt)
./arch/xtensa/include/asm/cacheflush.h:/* This is not required, see 
Documentation/cachetlb.txt */

  Andrea


> 
> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> index 53699c79ee54..04074059bcdc 100644
> --- a/Documentation/00-INDEX
> +++ b/Documentation/00-INDEX
> @@ -76,8 +76,6 @@ bus-devices/
>   - directory with info on TI GPMC (General Purpose Memory Controller)
>  bus-virt-phys-mapping.txt
>   - how to access I/O mapped memory from within device drivers.
> -cachetlb.txt
> - - describes the cache/TLB flushing interfaces Linux uses.
>  cdrom/
>   - directory with information on the CD-ROM drivers that Linux has.
>  cgroup-v1/
> diff --git a/Documentation/cachetlb.txt b/Documentation/core-api/cachetlb.rst
> similarity index 100%
> rename from Documentation/cachetlb.txt
> rename to Documentation/core-api/cachetlb.rst
> diff --git a/Documentation/core-api/index.rst 
> b/Documentation/core-api/index.rst
> index c670a8031786..d4d71ee564ae 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -14,6 +14,7 @@ Core utilities
> kernel-api
> assoc_array
> atomic_ops
> +   cachetlb
> refcount-vs-atomic
> cpu_hotplug
> idr
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index 6dafc8085acc..983249906fc6 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -2903,7 +2903,7 @@ is discarded from the CPU's cache and reloaded.  To 
> deal with this, the
>  appropriate part of the kernel must invalidate the overlapping bits of the
>  cache on each CPU.
>  
> -See Documentation/cachetlb.txt for more information on cache management.
> +See Documentation/core-api/cachetlb.rst for more information on cache 
> management.
>  
>  
>  CACHE COHERENCY VS MMIO
> diff --git a/Documentation/translations/ko_KR/memory-barriers.txt 
> b/Documentation/translations/ko_KR/memory-barriers.txt
> index 0a0930ab4156..081937577c1a 100644
> --- a/Documentation/translations/ko_KR/memory-barriers.txt
> +++ b/Documentation/translations/ko_KR/memory-barriers.txt
> @@ -2846,7 +2846,7 @@ CPU 의 캐시에서 RAM 으로 쓰여지는 더티 캐시 라인에 의해 덮
>  문제를 해결하기 위해선, 커널의 적절한 부분에서 각 CPU 의 캐시 안의 문제가 되는
>  비트들을 무효화 시켜야 합니다.
>  
> -캐시 관리에 대한 더 많은 정보를 위해선 Documentation/cachetlb.txt 를
> +캐시 관리에 대한 더 많은 정보를 위해선 Documentation/core-api/cachetlb.rst 를
>  참고하세요.
>  
>  
> -- 
> 2.17.0
> 


[RFC PATCH v3 4/6] Documentation/features/locking: Use '!RWSEM_GENERIC_SPINLOCK' as Kconfig for 'rwsem-optimized'

2018-05-07 Thread Andrea Parri
Uses '!RWSEM_GENERIC_SPINLOCK' in place of 'Optimized asm/rwsem.h' as
Kconfig for 'rwsem-optimized': the new Kconfig expresses this feature
equivalently, while also enabling the script 'features-refresh.sh' to
operate on the corresponding arch support status file. Also refreshes
the status matrix by running that script.

Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 Documentation/features/locking/rwsem-optimized/arch-support.txt | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/features/locking/rwsem-optimized/arch-support.txt 
b/Documentation/features/locking/rwsem-optimized/arch-support.txt
index 8afe24ffa3ab4..e54b1f1a8091d 100644
--- a/Documentation/features/locking/rwsem-optimized/arch-support.txt
+++ b/Documentation/features/locking/rwsem-optimized/arch-support.txt
@@ -1,6 +1,6 @@
 #
 # Feature name:  rwsem-optimized
-# Kconfig:   Optimized asm/rwsem.h
+# Kconfig:   !RWSEM_GENERIC_SPINLOCK
 # description:   arch provides optimized rwsem APIs
 #
 ---
@@ -8,8 +8,8 @@
 ---
 |   alpha: |  ok  |
 | arc: | TODO |
-| arm: | TODO |
-|   arm64: | TODO |
+| arm: |  ok  |
+|   arm64: |  ok  |
 | c6x: | TODO |
 |   h8300: | TODO |
 | hexagon: | TODO |
@@ -26,7 +26,7 @@
 |s390: |  ok  |
 |  sh: |  ok  |
 |   sparc: |  ok  |
-|  um: | TODO |
+|  um: |  ok  |
 |   unicore32: | TODO |
 | x86: |  ok  |
 |  xtensa: |  ok  |
-- 
2.7.4



[RFC PATCH v3 3/6] Documentation/features/core: Add arch support status files for 'cBPF-JIT' and 'eBPF-JIT'

2018-05-07 Thread Andrea Parri
Commit 606b5908e split 'HAVE_BPF_JIT' into cBPF and eBPF variants.
Adds arch support status files for the new variants, and removes the
status file corresponding to 'HAVE_BPF_JIT'. The new status matrices
were auto-generated using the script 'features-refresh.sh'.

Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 .../features/core/BPF-JIT/arch-support.txt | 33 --
 .../features/core/cBPF-JIT/arch-support.txt| 33 ++
 .../features/core/eBPF-JIT/arch-support.txt| 33 ++
 3 files changed, 66 insertions(+), 33 deletions(-)
 delete mode 100644 Documentation/features/core/BPF-JIT/arch-support.txt
 create mode 100644 Documentation/features/core/cBPF-JIT/arch-support.txt
 create mode 100644 Documentation/features/core/eBPF-JIT/arch-support.txt

diff --git a/Documentation/features/core/BPF-JIT/arch-support.txt 
b/Documentation/features/core/BPF-JIT/arch-support.txt
deleted file mode 100644
index d277f971ccd6b..0
--- a/Documentation/features/core/BPF-JIT/arch-support.txt
+++ /dev/null
@@ -1,33 +0,0 @@
-#
-# Feature name:  BPF-JIT
-# Kconfig:   HAVE_BPF_JIT
-# description:   arch supports BPF JIT optimizations
-#
----
-| arch |status|
----
-|   alpha: | TODO |
-| arc: | TODO |
-| arm: |  ok  |
-|   arm64: |  ok  |
-| c6x: | TODO |
-|   h8300: | TODO |
-| hexagon: | TODO |
-|ia64: | TODO |
-|m68k: | TODO |
-|  microblaze: | TODO |
-|mips: |  ok  |
-|   nds32: | TODO |
-|   nios2: | TODO |
-|openrisc: | TODO |
-|  parisc: | TODO |
-| powerpc: |  ok  |
-|   riscv: | TODO |
-|s390: |  ok  |
-|  sh: | TODO |
-|   sparc: |  ok  |
-|  um: | TODO |
-|   unicore32: | TODO |
-| x86: |  ok  |
-|  xtensa: | TODO |
----
diff --git a/Documentation/features/core/cBPF-JIT/arch-support.txt 
b/Documentation/features/core/cBPF-JIT/arch-support.txt
new file mode 100644
index 0..90459cdde3143
--- /dev/null
+++ b/Documentation/features/core/cBPF-JIT/arch-support.txt
@@ -0,0 +1,33 @@
+#
+# Feature name:  cBPF-JIT
+# Kconfig:   HAVE_CBPF_JIT
+# description:   arch supports cBPF JIT optimizations
+#
+---
+| arch |status|
+---
+|   alpha: | TODO |
+| arc: | TODO |
+| arm: | TODO |
+|   arm64: | TODO |
+| c6x: | TODO |
+|   h8300: | TODO |
+| hexagon: | TODO |
+|ia64: | TODO |
+|m68k: | TODO |
+|  microblaze: | TODO |
+|mips: |  ok  |
+|   nds32: | TODO |
+|   nios2: | TODO |
+|openrisc: | TODO |
+|  parisc: | TODO |
+| powerpc: |  ok  |
+|   riscv: | TODO |
+|s390: | TODO |
+|  sh: | TODO |
+|   sparc: |  ok  |
+|  um: | TODO |
+|   unicore32: | TODO |
+| x86: | TODO |
+|  xtensa: | TODO |
+---
diff --git a/Documentation/features/core/eBPF-JIT/arch-support.txt 
b/Documentation/features/core/eBPF-JIT/arch-support.txt
new file mode 100644
index 0..c90a0382fe667
--- /dev/null
+++ b/Documentation/features/core/eBPF-JIT/arch-support.txt
@@ -0,0 +1,33 @@
+#
+# Feature name:  eBPF-JIT
+# Kconfig:   HAVE_EBPF_JIT
+# description:   arch supports eBPF JIT optimizations
+#
+---
+| arch |status|
+---
+|   alpha: | TODO |
+| arc: | TODO |
+| arm: |  ok  |
+|   arm64: |  ok  |
+| c6x: | TODO |
+|   h8300: | TODO |
+| hexagon: | TODO |
+|ia64: | TODO |
+|m68k: | TODO |
+|  microblaze: | TODO |
+|mips: |  ok  |
+|   nds32: | TODO |
+|   nios2: | TODO |
+|openrisc: | TODO |
+|  parisc: | TODO |
+| powerpc: |  ok  |
+|   riscv: | TODO |
+|s390: |  ok  |
+|  sh: | TODO |
+|   sparc: |  ok  |
+|  um: | TODO |
+|   unicore32: | TODO |
+| x86: |  ok  |
+|  xtensa: | TODO |
+---
-- 
2.7.4



[RFC PATCH v3 5/6] Documentation/features/lib: Remove arch support status file for 'strncasecmp'

2018-05-07 Thread Andrea Parri
Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 .../features/lib/strncasecmp/arch-support.txt  | 33 --
 1 file changed, 33 deletions(-)
 delete mode 100644 Documentation/features/lib/strncasecmp/arch-support.txt

diff --git a/Documentation/features/lib/strncasecmp/arch-support.txt 
b/Documentation/features/lib/strncasecmp/arch-support.txt
deleted file mode 100644
index 6148f42c3d902..0
--- a/Documentation/features/lib/strncasecmp/arch-support.txt
+++ /dev/null
@@ -1,33 +0,0 @@
-#
-# Feature name:  strncasecmp
-# Kconfig:   __HAVE_ARCH_STRNCASECMP
-# description:   arch provides an optimized strncasecmp() function
-#
----
-| arch |status|
----
-|   alpha: | TODO |
-| arc: | TODO |
-| arm: | TODO |
-|   arm64: | TODO |
-| c6x: | TODO |
-|   h8300: | TODO |
-| hexagon: | TODO |
-|ia64: | TODO |
-|m68k: | TODO |
-|  microblaze: | TODO |
-|mips: | TODO |
-|   nds32: | TODO |
-|   nios2: | TODO |
-|openrisc: | TODO |
-|  parisc: | TODO |
-| powerpc: | TODO |
-|   riscv: | TODO |
-|s390: | TODO |
-|  sh: | TODO |
-|   sparc: | TODO |
-|  um: | TODO |
-|   unicore32: | TODO |
-| x86: | TODO |
-|  xtensa: | TODO |
----
-- 
2.7.4



[RFC PATCH v3 6/6] Documentation/features/vm: Remove arch support status file for 'pte_special'

2018-05-07 Thread Andrea Parri
Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 .../features/vm/pte_special/arch-support.txt   | 33 --
 1 file changed, 33 deletions(-)
 delete mode 100644 Documentation/features/vm/pte_special/arch-support.txt

diff --git a/Documentation/features/vm/pte_special/arch-support.txt 
b/Documentation/features/vm/pte_special/arch-support.txt
deleted file mode 100644
index 6a608a6dcf71d..0
--- a/Documentation/features/vm/pte_special/arch-support.txt
+++ /dev/null
@@ -1,33 +0,0 @@
-#
-# Feature name:  pte_special
-# Kconfig:   __HAVE_ARCH_PTE_SPECIAL
-# description:   arch supports the pte_special()/pte_mkspecial() VM 
APIs
-#
----
-| arch |status|
----
-|   alpha: | TODO |
-| arc: |  ok  |
-| arm: |  ok  |
-|   arm64: |  ok  |
-| c6x: | TODO |
-|   h8300: | TODO |
-| hexagon: | TODO |
-|ia64: | TODO |
-|m68k: | TODO |
-|  microblaze: | TODO |
-|mips: | TODO |
-|   nds32: | TODO |
-|   nios2: | TODO |
-|openrisc: | TODO |
-|  parisc: | TODO |
-| powerpc: |  ok  |
-|   riscv: | TODO |
-|s390: |  ok  |
-|  sh: |  ok  |
-|   sparc: |  ok  |
-|  um: | TODO |
-|   unicore32: | TODO |
-| x86: |  ok  |
-|  xtensa: | TODO |
----
-- 
2.7.4



[RFC PATCH v3 2/6] Documentation/features: Refresh the arch support status files in place

2018-05-07 Thread Andrea Parri
Now that the script 'features-refresh.sh' is available, uses this script
to refresh all the arch-support.txt files in place.

Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 Documentation/features/core/BPF-JIT/arch-support.txt   |  2 ++
 .../features/core/generic-idle-thread/arch-support.txt |  4 +++-
 Documentation/features/core/jump-labels/arch-support.txt   |  2 ++
 Documentation/features/core/tracehook/arch-support.txt |  2 ++
 Documentation/features/debug/KASAN/arch-support.txt|  4 +++-
 Documentation/features/debug/gcov-profile-all/arch-support.txt |  2 ++
 Documentation/features/debug/kgdb/arch-support.txt |  4 +++-
 .../features/debug/kprobes-on-ftrace/arch-support.txt  |  2 ++
 Documentation/features/debug/kprobes/arch-support.txt  |  4 +++-
 Documentation/features/debug/kretprobes/arch-support.txt   |  4 +++-
 Documentation/features/debug/optprobes/arch-support.txt|  4 +++-
 Documentation/features/debug/stackprotector/arch-support.txt   |  2 ++
 Documentation/features/debug/uprobes/arch-support.txt  |  6 --
 .../features/debug/user-ret-profiler/arch-support.txt  |  2 ++
 Documentation/features/io/dma-api-debug/arch-support.txt   |  2 ++
 Documentation/features/io/dma-contiguous/arch-support.txt  |  4 +++-
 Documentation/features/io/sg-chain/arch-support.txt|  2 ++
 Documentation/features/lib/strncasecmp/arch-support.txt|  2 ++
 Documentation/features/locking/cmpxchg-local/arch-support.txt  |  4 +++-
 Documentation/features/locking/lockdep/arch-support.txt|  4 +++-
 Documentation/features/locking/queued-rwlocks/arch-support.txt | 10 ++
 .../features/locking/queued-spinlocks/arch-support.txt |  8 +---
 .../features/locking/rwsem-optimized/arch-support.txt  |  2 ++
 Documentation/features/perf/kprobes-event/arch-support.txt |  6 --
 Documentation/features/perf/perf-regs/arch-support.txt |  4 +++-
 Documentation/features/perf/perf-stackdump/arch-support.txt|  4 +++-
 .../features/sched/membarrier-sync-core/arch-support.txt   |  2 ++
 Documentation/features/sched/numa-balancing/arch-support.txt   |  6 --
 Documentation/features/seccomp/seccomp-filter/arch-support.txt |  6 --
 .../features/time/arch-tick-broadcast/arch-support.txt |  4 +++-
 Documentation/features/time/clockevents/arch-support.txt   |  4 +++-
 Documentation/features/time/context-tracking/arch-support.txt  |  2 ++
 Documentation/features/time/irq-time-acct/arch-support.txt |  4 +++-
 .../features/time/modern-timekeeping/arch-support.txt  |  2 ++
 Documentation/features/time/virt-cpuacct/arch-support.txt  |  2 ++
 Documentation/features/vm/ELF-ASLR/arch-support.txt|  4 +++-
 Documentation/features/vm/PG_uncached/arch-support.txt |  2 ++
 Documentation/features/vm/THP/arch-support.txt |  2 ++
 Documentation/features/vm/TLB/arch-support.txt |  2 ++
 Documentation/features/vm/huge-vmap/arch-support.txt   |  2 ++
 Documentation/features/vm/ioremap_prot/arch-support.txt|  2 ++
 Documentation/features/vm/numa-memblock/arch-support.txt   |  4 +++-
 Documentation/features/vm/pte_special/arch-support.txt |  2 ++
 43 files changed, 117 insertions(+), 31 deletions(-)

diff --git a/Documentation/features/core/BPF-JIT/arch-support.txt 
b/Documentation/features/core/BPF-JIT/arch-support.txt
index 0b96b4e1e7d4a..d277f971ccd6b 100644
--- a/Documentation/features/core/BPF-JIT/arch-support.txt
+++ b/Documentation/features/core/BPF-JIT/arch-support.txt
@@ -17,10 +17,12 @@
 |m68k: | TODO |
 |  microblaze: | TODO |
 |mips: |  ok  |
+|   nds32: | TODO |
 |   nios2: | TODO |
 |openrisc: | TODO |
 |  parisc: | TODO |
 | powerpc: |  ok  |
+|   riscv: | TODO |
 |s390: |  ok  |
 |  sh: | TODO |
 |   sparc: |  ok  |
diff --git a/Documentation/features/core/generic-idle-thread/arch-support.txt 
b/Documentation/features/core/generic-idle-thread/arch-support.txt
index 372a2b18a6172..0ef6acdb991c7 100644
--- a/Documentation/features/core/generic-idle-thread/arch-support.txt
+++ b/Documentation/features/core/generic-idle-thread/arch-support.txt
@@ -17,10 +17,12 @@
 |m68k: | TODO |
 |  microblaze: | TODO |
 |mips: |  ok  |
+|   nds32: | TODO |
 |   nios2: | TODO |
-|openrisc: | TODO |
+|openrisc: |  ok  |
 |  parisc: |  ok  |
 | powerpc: |  ok  |
+|   riscv: |  ok  |
 |s390: |  ok  |
 |  sh: |  ok  |
 |   sparc: |  ok  |
diff --git a/Documentation/features/core/jump-labels/arch-support.txt 
b/Documentation/features/core/jump-labels/arch-support.txt
index ad97217b003ba..27cbd63abfd28 100644
--- a/Doc

[RFC PATCH v3 1/6] Documentation/features: Add script that refreshes the arch support status files in place

2018-05-07 Thread Andrea Parri
Provides the script:

Documentation/features/scripts/features-refresh.sh

which operates on the arch-support.txt files and refreshes them in place.

This way [1],

   "[...] we soft- decouple the refreshing of the entries from the
introduction of the features, while still making it all easy to
keep sync and to extend."

[1] http://lkml.kernel.org/r/20180328122211.GA25420@andrea

Suggested-by: Ingo Molnar 
Signed-off-by: Andrea Parri 
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Andrew Morton 
---
 Documentation/features/scripts/features-refresh.sh | 98 ++
 1 file changed, 98 insertions(+)
 create mode 100755 Documentation/features/scripts/features-refresh.sh

diff --git a/Documentation/features/scripts/features-refresh.sh 
b/Documentation/features/scripts/features-refresh.sh
new file mode 100755
index 0..9e72d38a0720e
--- /dev/null
+++ b/Documentation/features/scripts/features-refresh.sh
@@ -0,0 +1,98 @@
+#
+# Small script that refreshes the kernel feature support status in place.
+#
+
+for F_FILE in Documentation/features/*/*/arch-support.txt; do
+   F=$(grep "^# Kconfig:" "$F_FILE" | cut -c26-)
+
+   #
+   # Each feature F is identified by a pair (O, K), where 'O' can
+   # be either the empty string (for 'nop') or "not" (the logical
+   # negation operator '!'); other operators are not supported.
+   #
+   O=""
+   K=$F
+   if [[ "$F" == !* ]]; then
+   O="not"
+   K=$(echo $F | sed -e 's/^!//g')
+   fi
+
+   #
+   # F := (O, K) is 'valid' iff there is a Kconfig file (for some
+   # arch) which contains K.
+   #
+   # Notice that this definition entails an 'asymmetry' between
+   # the case 'O = ""' and the case 'O = "not"'. E.g., F may be
+   # _invalid_ if:
+   #
+   # [case 'O = ""']
+   #   1) no arch provides support for F,
+   #   2) K does not exist (e.g., it was renamed/mis-typed);
+   #
+   # [case 'O = "not"']
+   #   3) all archs provide support for F,
+   #   4) as in (2).
+   #
+   # The rationale for adopting this definition (and, thus, for
+   # keeping the asymmetry) is:
+   #
+   #   We want to be able to 'detect' (2) (or (4)).
+   #
+   # (1) and (3) may further warn the developers about the fact
+   # that K can be removed.
+   #
+   F_VALID="false"
+   for ARCH_DIR in arch/*/; do
+   K_FILES=$(find $ARCH_DIR -name "Kconfig*")
+   K_GREP=$(grep "$K" $K_FILES)
+   if [ ! -z "$K_GREP" ]; then
+   F_VALID="true"
+   break
+   fi
+   done
+   if [ "$F_VALID" = "false" ]; then
+   printf "WARNING: '%s' is not a valid Kconfig\n" "$F"
+   fi
+
+   T_FILE="$F_FILE.tmp"
+   grep "^#" $F_FILE > $T_FILE
+   echo "---" >> $T_FILE
+   echo "| arch |status|" >> $T_FILE
+   echo "---" >> $T_FILE
+   for ARCH_DIR in arch/*/; do
+   ARCH=$(echo $ARCH_DIR | sed -e 's/arch//g' | sed -e 's/\///g')
+   K_FILES=$(find $ARCH_DIR -name "Kconfig*")
+   K_GREP=$(grep "$K" $K_FILES)
+   #
+   # Arch support status values for (O, K) are updated according
+   # to the following rules.
+   #
+   #   - ("", K) is 'supported by a given arch', if there is a
+   # Kconfig file for that arch which contains K;
+   #
+   #   - ("not", K) is 'supported by a given arch', if there is
+   # no Kconfig file for that arch which contains K;
+   #
+   #   - otherwise: preserve the previous status value (if any),
+   #default to 'not yet supported'.
+   #
+   # Notice that, according to these rules, invalid features may be
+   # updated/modified.
+   #
+   if [ "$O" = "" ] && [ ! -z "$K_GREP" ]; then
+   printf "|%12s: |  ok  |\n" "$ARCH" >> $T_FILE
+   elif [ "$O" = "not" ] && [ -z "$K_GREP" ]; then
+   printf "|%12s: |  ok  |\n" "$ARCH" >> $T_FILE
+   else
+   S=$(grep -v "^#" "$F_FILE" | grep " $ARCH:")
+   if [ ! -z "$S" ]; then
+   echo "$S" >> $T_FILE
+   else
+   printf "|%12s: | TODO |\n" "$ARCH" \
+   >> $T_FILE
+   fi
+   fi
+   done
+   echo "---" >> $T_FILE
+   mv $T_FILE $F_FILE
+done
-- 
2.7.4



[RFC PATCH v3 0/6] Documentation/features: Provide and apply 'features-refresh.sh'

2018-05-07 Thread Andrea Parri
Hi,

This series provides the script 'features-refresh.sh', which operates on
the arch support status files, and it applies this script to refresh the
status files in place; previous discussions about this series are at [1].

The series is organized as follows.

  - Patch 1/6 adds the script to 'Documentation/features/scripts/'.

  - Patch 2/6 presents the results of running the script; this run
also printed the messages

   WARNING: 'HAVE_BPF_JIT' is not a valid Kconfig
   WARNING: '__HAVE_ARCH_STRNCASECMP' is not a valid Kconfig
   WARNING: 'Optimized asm/rwsem.h' is not a valid Kconfig
   WARNING: '__HAVE_ARCH_PTE_SPECIAL' is not a valid Kconfig

to standard output.

  - Patches 3-6/6 fix each of these warnings.

(Applies on -rc4.)

Cheers,
  Andrea

[1] 
http://lkml.kernel.org/r/1523205027-31786-1-git-send-email-andrea.pa...@amarulasolutions.com

http://lkml.kernel.org/r/1522774551-9503-1-git-send-email-andrea.pa...@amarulasolutions.com
http://lkml.kernel.org/r/20180328122211.GA25420@andrea

Changes in v3:
  - rebase on -rc4

Changes in v2:
  - support negation operators in Kconfig (suggested by Ingo Molnar)
  - reorder patches 2/6 and 3/6 (suggested by Ingo Molnar)
  - add patches 4-6/6 (suggested by Ingo Molnar)

Andrea Parri (6):
  Documentation/features: Add script that refreshes the arch support
status files in place
  Documentation/features: Refresh the arch support status files in place
  Documentation/features/core: Add arch support status files for
'cBPF-JIT' and 'eBPF-JIT'
  Documentation/features/locking: Use '!RWSEM_GENERIC_SPINLOCK' as
Kconfig for 'rwsem-optimized'
  Documentation/features/lib: Remove arch support status file for
'strncasecmp'
  Documentation/features/vm: Remove arch support status file for
'pte_special'

 .../features/core/BPF-JIT/arch-support.txt | 31 ---
 .../features/core/cBPF-JIT/arch-support.txt| 33 
 .../features/core/eBPF-JIT/arch-support.txt| 33 
 .../core/generic-idle-thread/arch-support.txt  |  4 +-
 .../features/core/jump-labels/arch-support.txt |  2 +
 .../features/core/tracehook/arch-support.txt   |  2 +
 .../features/debug/KASAN/arch-support.txt  |  4 +-
 .../debug/gcov-profile-all/arch-support.txt|  2 +
 Documentation/features/debug/kgdb/arch-support.txt |  4 +-
 .../debug/kprobes-on-ftrace/arch-support.txt   |  2 +
 .../features/debug/kprobes/arch-support.txt|  4 +-
 .../features/debug/kretprobes/arch-support.txt |  4 +-
 .../features/debug/optprobes/arch-support.txt  |  4 +-
 .../features/debug/stackprotector/arch-support.txt |  2 +
 .../features/debug/uprobes/arch-support.txt|  6 +-
 .../debug/user-ret-profiler/arch-support.txt   |  2 +
 .../features/io/dma-api-debug/arch-support.txt |  2 +
 .../features/io/dma-contiguous/arch-support.txt|  4 +-
 .../features/io/sg-chain/arch-support.txt  |  2 +
 .../features/lib/strncasecmp/arch-support.txt  | 31 ---
 .../locking/cmpxchg-local/arch-support.txt |  4 +-
 .../features/locking/lockdep/arch-support.txt  |  4 +-
 .../locking/queued-rwlocks/arch-support.txt| 10 ++-
 .../locking/queued-spinlocks/arch-support.txt  |  8 +-
 .../locking/rwsem-optimized/arch-support.txt   | 10 ++-
 .../features/perf/kprobes-event/arch-support.txt   |  6 +-
 .../features/perf/perf-regs/arch-support.txt   |  4 +-
 .../features/perf/perf-stackdump/arch-support.txt  |  4 +-
 .../sched/membarrier-sync-core/arch-support.txt|  2 +
 .../features/sched/numa-balancing/arch-support.txt |  6 +-
 Documentation/features/scripts/features-refresh.sh | 98 ++
 .../seccomp/seccomp-filter/arch-support.txt|  6 +-
 .../time/arch-tick-broadcast/arch-support.txt  |  4 +-
 .../features/time/clockevents/arch-support.txt |  4 +-
 .../time/context-tracking/arch-support.txt |  2 +
 .../features/time/irq-time-acct/arch-support.txt   |  4 +-
 .../time/modern-timekeeping/arch-support.txt   |  2 +
 .../features/time/virt-cpuacct/arch-support.txt|  2 +
 .../features/vm/ELF-ASLR/arch-support.txt  |  4 +-
 .../features/vm/PG_uncached/arch-support.txt   |  2 +
 Documentation/features/vm/THP/arch-support.txt |  2 +
 Documentation/features/vm/TLB/arch-support.txt |  2 +
 .../features/vm/huge-vmap/arch-support.txt |  2 +
 .../features/vm/ioremap_prot/arch-support.txt  |  2 +
 .../features/vm/numa-memblock/arch-support.txt |  4 +-
 .../features/vm/pte_special/arch-support.txt   | 31 ---
 46 files changed, 279 insertions(+), 128 deletions(-)
 delete mode 100644 Documentation/features/core/BPF-JIT/arch-support.txt
 create mode 100644 Documentation/features/core/cBPF-JIT/arch-support.txt
 create mode 100644 Documentation/features/core/eBPF-JIT/arch-s

Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more

2018-05-07 Thread Andrea Parri
On Sun, May 06, 2018 at 04:57:27PM +0200, Ingo Molnar wrote:
> 
> * Andrea Parri  wrote:
> 
> > Hi Ingo,
> > 
> > > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> > > From: Ingo Molnar 
> > > Date: Sat, 5 May 2018 10:23:23 +0200
> > > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h 
> > > some more
> > > 
> > > Before:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)
> > > atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)
> > > atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)
> > > atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed   atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire   atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release   atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec_acquire
> > >  #  define atomic_fetch_dec_acquire(...)  
> > > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec_release
> > >  #  define atomic_fetch_dec_release(...)  
> > > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)  
> > > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > After:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)
> > > atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)
> > > atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)
> > > atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed   atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire   atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release   atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)  
> > > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)  
> > > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)  
> > > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > The idea is that because we already group these APIs by certain defines
> > > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> > > branches - we can do the same in the secondary branch as well.
> > > 
> > > ( Also remove some unnecessarily duplicate comments, as the API
> > >   group defines are now pretty much self-documenting. )
> > > 
> > > No change in functionality.
> > > 
> > > Cc: Peter Zijlstra 
> > > Cc: Linus Torvalds 
> > > Cc: Andrew Morton 
> > > Cc: Thomas Gleixner 
> > > Cc: Paul E. McKenney 
> > > Cc: Will Deacon 
> > > Cc: linux-kernel@vger.kernel.org
> > > Signed-off-by: Ingo Molnar 
> > 
> > This breaks compilation on RISC-V. (For some of its atomics, the arch
> > currently defines the _relaxed and the full variants and it relies on
> > the generic definitions for the _acquire and the _release variants.)
> 
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

I'm using the toolchain from:

  https://riscv.org/software-tools/

(adding Palmer and Albert in Cc:)


> 
> Could you please send a patch that defines those variants against Linus's 
> tree, 
> like the PowerPC patch that does something similar:
> 
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to 
> asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> ?

Yes, please see below for a first RFC.

(BTW, get_maintainer.pl says that th

Re: [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions

2018-05-06 Thread Andrea Parri
Hi Ingo,

> From f5efafa83af8c46b9e81b010b46caeeadb450179 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar 
> Date: Sat, 5 May 2018 10:46:41 +0200
> Subject: [PATCH] locking/atomics: Combine the atomic_andnot() and 
> atomic64_andnot() API definitions
> 
> The atomic_andnot() and atomic64_andnot() are defined in 4 separate groups
> spred out in the atomic.h header:
> 
>  #ifdef atomic_andnot
>  ...
>  #endif /* atomic_andnot */
>  ...
>  #ifndef atomic_andnot
>  ...
>  #endif
>  ...
>  #ifdef atomic64_andnot
>  ...
>  #endif /* atomic64_andnot */
>  ...
>  #ifndef atomic64_andnot
>  ...
>  #endif
> 
> Combine them into unify them into two groups:

Nit: "Combine them into unify them into"

  Andrea


> 
>  #ifdef atomic_andnot
>  #else
>  #endif
> 
>  ...
> 
>  #ifdef atomic64_andnot
>  #else
>  #endif
> 
> So that one API group is defined in a single place within the header.
> 
> Cc: Peter Zijlstra 
> Cc: Linus Torvalds 
> Cc: Andrew Morton 
> Cc: Thomas Gleixner 
> Cc: Paul E. McKenney 
> Cc: Will Deacon 
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ingo Molnar 
> ---
>  include/linux/atomic.h | 72 
> +-
>  1 file changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 352ecc72d7f5..1176cf7c6f03 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -205,22 +205,6 @@
>  # endif
>  #endif
>  
> -#ifdef atomic_andnot
> -
> -#ifndef atomic_fetch_andnot_relaxed
> -# define atomic_fetch_andnot_relaxed atomic_fetch_andnot
> -# define atomic_fetch_andnot_acquire atomic_fetch_andnot
> -# define atomic_fetch_andnot_release atomic_fetch_andnot
> -#else
> -# ifndef atomic_fetch_andnot
> -#  define atomic_fetch_andnot(...)   
> __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> -#  define atomic_fetch_andnot_acquire(...)   
> __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> -#  define atomic_fetch_andnot_release(...)   
> __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
> -#endif
> -
> -#endif /* atomic_andnot */
> -
>  #ifndef atomic_fetch_xor_relaxed
>  # define atomic_fetch_xor_relaxedatomic_fetch_xor
>  # define atomic_fetch_xor_acquireatomic_fetch_xor
> @@ -338,7 +322,22 @@ static inline int atomic_add_unless(atomic_t *v, int a, 
> int u)
>  # define atomic_inc_not_zero(v)  atomic_add_unless((v), 
> 1, 0)
>  #endif
>  
> -#ifndef atomic_andnot
> +#ifdef atomic_andnot
> +
> +#ifndef atomic_fetch_andnot_relaxed
> +# define atomic_fetch_andnot_relaxed atomic_fetch_andnot
> +# define atomic_fetch_andnot_acquire atomic_fetch_andnot
> +# define atomic_fetch_andnot_release atomic_fetch_andnot
> +#else
> +# ifndef atomic_fetch_andnot
> +#  define atomic_fetch_andnot(...)   
> __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_acquire(...)   
> __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_release(...)   
> __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> +# endif
> +#endif
> +
> +#else /* !atomic_andnot: */
> +
>  static inline void atomic_andnot(int i, atomic_t *v)
>  {
>   atomic_and(~i, v);
> @@ -363,7 +362,8 @@ static inline int atomic_fetch_andnot_release(int i, 
> atomic_t *v)
>  {
>   return atomic_fetch_and_release(~i, v);
>  }
> -#endif
> +
> +#endif /* !atomic_andnot */
>  
>  /**
>   * atomic_inc_not_zero_hint - increment if not null
> @@ -600,22 +600,6 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # endif
>  #endif
>  
> -#ifdef atomic64_andnot
> -
> -#ifndef atomic64_fetch_andnot_relaxed
> -# define atomic64_fetch_andnot_relaxed   atomic64_fetch_andnot
> -# define atomic64_fetch_andnot_acquire   atomic64_fetch_andnot
> -# define atomic64_fetch_andnot_release   atomic64_fetch_andnot
> -#else
> -# ifndef atomic64_fetch_andnot
> -#  define atomic64_fetch_andnot(...) 
> __atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
> -#  define atomic64_fetch_andnot_acquire(...) 
> __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> -#  define atomic64_fetch_andnot_release(...) 
> __atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
> -#endif
> -
> -#endif /* atomic64_andnot */
> -
>  #ifndef atomic64_fetch_xor_relaxed
>  # define atomic64_fetch_xor_relaxed  atomic64_fetch_xor
>  # define atomic64_fetch_xor_acquire  atomic64_fetch_xor
> @@ -672,7 +656,22 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # define atomic64_try_cmpxchg_releaseatomic64_try_cmpxchg
>  #endif
>  
> -#ifndef atomic64_andnot
> +#ifdef atomic64_andnot
> +
> +#ifndef atomic64_fetch_andnot_relaxed
> +# define atomic64_fetch_andnot_relaxed   atomic64_fetch_andnot
> +# define atomic64_fetch_andnot_acquire   atomic64_fetch_andnot
> +# define atomic64_fetch_andn

Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more

2018-05-06 Thread Andrea Parri
Hi Ingo,

> From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> From: Ingo Molnar 
> Date: Sat, 5 May 2018 10:23:23 +0200
> Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h 
> some more
> 
> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)
> atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)
> atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)
> atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed   atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire   atomic_fetch_dec
>  #  define atomic_fetch_dec_release   atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)  
> __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)  
> __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)  
> __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)
> atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)
> atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)
> atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed   atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire   atomic_fetch_dec
>  #  define atomic_fetch_dec_release   atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)  
> __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)  
> __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)  
> __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The idea is that because we already group these APIs by certain defines
> such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> branches - we can do the same in the secondary branch as well.
> 
> ( Also remove some unnecessarily duplicate comments, as the API
>   group defines are now pretty much self-documenting. )
> 
> No change in functionality.
> 
> Cc: Peter Zijlstra 
> Cc: Linus Torvalds 
> Cc: Andrew Morton 
> Cc: Thomas Gleixner 
> Cc: Paul E. McKenney 
> Cc: Will Deacon 
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ingo Molnar 

This breaks compilation on RISC-V. (For some of its atomics, the arch
currently defines the _relaxed and the full variants and it relies on
the generic definitions for the _acquire and the _release variants.)
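
Concretely, the breakage mode can be sketched as follows (illustrative
macros only, not the actual riscv code):

    /*
     * Suppose an arch provides the _relaxed and the full variants of
     * an op, e.g. (schematically):
     */
    #define atomic_fetch_add_relaxed atomic_fetch_add_relaxed  /* arch code */
    #define atomic_fetch_add         atomic_fetch_add          /* arch code */

    /*
     * The regrouped fallbacks then take the #else branch (because
     * _relaxed is defined), but they synthesize the _acquire/_release
     * variants only when the full op is *also* missing:
     */
    #ifndef atomic_fetch_add_relaxed
    /* ... not taken ... */
    #else
    # ifndef atomic_fetch_add  /* false here: the full op exists */
    #  define atomic_fetch_add_acquire(...) __atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
    #  define atomic_fetch_add_release(...) __atomic_op_release(atomic_fetch_add, __VA_ARGS__)
    # endif
    #endif
    /* Result: atomic_fetch_add_acquire()/_release() are left undefined. */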

  Andrea


> ---
>  include/linux/atomic.h | 312 
> ++---
>  1 file changed, 62 insertions(+), 250 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 67aaafba256b..352ecc72d7f5 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -71,98 +71,66 @@
>  })
>  #endif
>  
> -/* atomic_add_return_relaxed() et al: */
> -
>  #ifndef atomic_add_return_relaxed
>  # define atomic_add_return_relaxed   atomic_add_return
>  # define atomic_add_return_acquire   atomic_add_return
>  # define atomic_add_return_release   atomic_add_return
>  #else
> -# ifndef atomic_add_return_acquire
> -#  define atomic_add_return_acquire(...) 
> __atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_add_return_release
> -#  define atomic_add_return_release(...) 
> __atomic_op_release(atomic_add_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_add_return
>  #  define atomic_add_return(...) 
> __atomic_op_fence(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_acquire(...) 
> __atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_release(...) 
> __atomic_op_release(atomic_add_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_inc_return_relaxed() et al: */
> -
>  #ifndef atomic_inc_return_relaxed
>  # define atomic_inc_return_relaxed   atomic_inc_return
>  # define atomic_inc_return_acquire   atomic_inc_return
>  # define atomic_inc_return_release   atomic_inc_return
>  #else
> -# ifndef atomic_inc_return_acquire
> -#  define atomic_inc_return_acquire(...) 
> __atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_inc_return_release
> -#  define atomic_inc_return_release(...)

Re: [PATCH] Documentation: refcount-vs-atomic: Update reference to LKMM doc.

2018-05-04 Thread Andrea Parri
On Fri, May 04, 2018 at 02:13:59PM -0700, Kees Cook wrote:
> On Fri, May 4, 2018 at 2:11 PM, Andrea Parri
>  wrote:
> > The LKMM project has moved to 'tools/memory-model/'.
> >
> > Signed-off-by: Andrea Parri 
> > ---
> >  Documentation/core-api/refcount-vs-atomic.rst | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
> > b/Documentation/core-api/refcount-vs-atomic.rst
> > index 83351c258cdb9..322851bada167 100644
> > --- a/Documentation/core-api/refcount-vs-atomic.rst
> > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > @@ -17,7 +17,7 @@ in order to help maintainers validate their code against 
> > the change in
> >  these memory ordering guarantees.
> >
> >  The terms used through this document try to follow the formal LKMM defined 
> > in
> > -github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
> > +tools/memory-model/Documentation/explanation.txt.
> >
> >  memory-barriers.txt and atomic_t.txt provide more background to the
> >  memory ordering in general and for atomic operations specifically.
> 
> Will this get linkified by rst ?

I believe not, but I'm not too familiar with rst...  FWIW, I'm seeing that
the above memory-barriers.txt, atomic_t.txt are not linkified.

  Andrea


> 
> -Kees
> 
> -- 
> Kees Cook
> Pixel Security


[PATCH] Documentation: refcount-vs-atomic: Update reference to LKMM doc.

2018-05-04 Thread Andrea Parri
The LKMM project has moved to 'tools/memory-model/'.

Signed-off-by: Andrea Parri 
---
 Documentation/core-api/refcount-vs-atomic.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/core-api/refcount-vs-atomic.rst 
b/Documentation/core-api/refcount-vs-atomic.rst
index 83351c258cdb9..322851bada167 100644
--- a/Documentation/core-api/refcount-vs-atomic.rst
+++ b/Documentation/core-api/refcount-vs-atomic.rst
@@ -17,7 +17,7 @@ in order to help maintainers validate their code against the 
change in
 these memory ordering guarantees.
 
 The terms used through this document try to follow the formal LKMM defined in
-github.com/aparri/memory-model/blob/master/Documentation/explanation.txt
+tools/memory-model/Documentation/explanation.txt.
 
 memory-barriers.txt and atomic_t.txt provide more background to the
 memory ordering in general and for atomic operations specifically.
-- 
2.7.4



Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

2018-04-26 Thread Andrea Parri
On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote:

[...]

> +/*
> + * Special states are those that do not use the normal wait-loop pattern. See
> + * the comment with set_special_state().
> + */
> +#define is_special_state(state)  \
> + ((state) == TASK_DEAD ||\
> +  (state) == TASK_STOPPED)
> +
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>  
> +/*
> + * Assert we don't use the regular *set_current_state() helpers for special
> + * states. See the comment with set_special_state().
> + */
> +#define assert_special_state(state)  WARN_ON_ONCE(is_special_state(state))

Nitpicking, this name suggests "Shout if the state is NOT special" to me:
maybe,

#define assert_special_state(state) WARN_ON_ONCE(!is_special_state(state))
#define assert_regular_state(state) WARN_ON_ONCE(is_special_state(state))

or just make do with the WARN_ON_ONCE()s?

  Andrea


> +
>  #define __set_current_state(state_value) \
>   do {\
> + assert_special_state(state_value);  \
>   current->task_state_change = _THIS_IP_; \
>   current->state = (state_value); \
>   } while (0)
> +
>  #define set_current_state(state_value)   \
>   do {\
> + assert_special_state(state_value);  \
>   current->task_state_change = _THIS_IP_; \
>   smp_store_mb(current->state, (state_value));\
>   } while (0)
>  
> +#define set_special_state(state_value)   
> \
> + do {\
> + unsigned long flags; /* may shadow */   \
> + WARN_ON_ONCE(!is_special_state(state_value));   \
> + raw_spin_lock_irqsave(¤t->pi_lock, flags);\
> + current->task_state_change = _THIS_IP_; \
> + current->state = (state_value); \
> + raw_spin_unlock_irqrestore(¤t->pi_lock, flags);   \
> + } while (0)


Re: [PATCH 4/4] exit: Lockless iteration over task list in mm_update_next_owner()

2018-04-26 Thread Andrea Parri
On Thu, Apr 26, 2018 at 04:52:39PM +0300, Kirill Tkhai wrote:
> On 26.04.2018 15:35, Andrea Parri wrote:

[...]

> > 
> > Mmh, it's possible that I am misunderstanding this statement but it does
> > not seem quite correct to me; a counter-example would be provided by the
> > test at "tools/memory-model/litmus-tests/SB+mbonceonces.litmus" (replace
> > either of the smp_mb() with the sequence:
> > 
> >spin_lock(s); spin_unlock(s); spin_lock(s); spin_unlock(s); ).
> > 
> > BTW, your commit message suggests that your case would work with "imply
> > an smp_wmb()".  This implication should hold "w.r.t. current implementa-
> > tions".  We (LKMM people) discussed changes to the LKMM to make it hold
> > in LKMM but such changes are still in our TODO list as of today...
> 
> I'm not close to LKMM, so the test you referenced is not clear to me.

The test could be concisely described by:

   {initially: x=y=0; }

   Thread0  Thread1

   x = 1;   y = 1;
   MB   MB
   r0 = y;  r1 = x;

   Can r0 and r1 both be 0 after joining?

The answer to the question is -No-; however, if you replaced either of the
MBs with the locking sequence described above, then the answer is -Yes-:
full fences on both sides are required to forbid that state and this is
something that the locking sequences won't be able to provide (think at
the implementation of these primitives for powerpc, for example).


> Does LKMM show the real hardware behavior? Or have most cases been
> added, with work still in progress?

Very roughly speaking, LKMM is an "envelope" of the underlying hardware
memory models/architectures supported by the Linux kernel which in turn
may not coincide with the observable behavior on a given
implementation/processor of that architecture.  Also, LKMM doesn't aim to be a "tight"
envelope.  I'd refer to the documentation within "tools/memory-model/";
please let me know if I can provide further info.


> 
> In the patch I used the logic, that the below code:
> 
>   x = A;
>   spin_lock();
>   spin_unlock();
>   spin_lock();
>   spin_unlock();
>   y = B;
> 
> cannot be reordered more than:
> 
>   spin_lock();
>   x = A;  <- this can't become visible later than spin_unlock()
>   spin_unlock();
>   spin_lock();
>   y = B;  <- this can't become visible earlier than spin_lock()
>   spin_unlock();
> 
> Is there a problem?

As mentioned in the previous email, if smp_wmb() is what you're looking
for then this should be fine (considering current implementations; LKMM
will likely be there soon...).

BTW, the behavior in question has been recently discussed on the list;
c.f., for example, the test "unlock-lock-write-ordering" described in:

  
http://lkml.kernel.org/r/1519301990-11766-1-git-send-email-parri.and...@gmail.com

as well as

  0123f4d76ca63b7b895f40089be0ce4809e392d8
  ("riscv/spinlock: Strengthen implementations with fences")

  Andrea


> 
> Kirill


Re: [PATCH 4/4] exit: Lockless iteration over task list in mm_update_next_owner()

2018-04-26 Thread Andrea Parri
Hi Kirill,

On Thu, Apr 26, 2018 at 02:01:07PM +0300, Kirill Tkhai wrote:
> The patch finalizes the series and makes mm_update_next_owner()
> iterate over the task list using RCU instead of tasklist_lock.
> This is possible because of the rules of inheritance of mm: it may be
> propagated to a child only, while only a kernel thread can obtain
> someone else's mm via use_mm().
> 
> Also, all new tasks are added to tail of tasks list or threads list.
> The only exception is transfer_pid() in de_thread(), when group
> leader is replaced by another thread. But transfer_pid() is called
> in case of successful exec only, where new mm is allocated, so it
> can't be interesting for mm_update_next_owner().
> 
> This patch uses alloc_pid() as a memory barrier, and it's possible
> since it contains two or more spin_lock()/spin_unlock() pairs.
> A single pair does not imply a barrier, while two pairs do.
> 
> There are three barriers:
> 
> 1) for_each_process(g)            copy_process()
>                                     p->mm = mm
>      smp_rmb();                     smp_wmb() implied by alloc_pid()
>      if (g->flags & PF_KTHREAD)     list_add_tail_rcu(&p->tasks,
>                                                       &init_task.tasks)
> 
> 2) for_each_thread(g, c)           copy_process()
>                                      p->mm = mm
>      smp_rmb();                      smp_wmb() implied by alloc_pid()
>      tmp = READ_ONCE(c->mm)          list_add_tail_rcu(&p->thread_node, ...)
> 
> 3) for_each_thread(g, c)           copy_process()
>                                      list_add_tail_rcu(&p->thread_node, ...)
>      p->mm != NULL check             do_exit()
>      smp_rmb()                       smp_mb();
>      get next thread in loop         p->mm = NULL
> 
> 
> This patch may be useful for machines with many processes executing.
> I regularly observe mm_update_next_owner() executing on one of the cpus
> in crash dumps (not related to this function) on big machines. Even
> if iteration over the task list looks like an unlikely situation, its
> regularity grows with the number of containers/processes.
> 
> Signed-off-by: Kirill Tkhai 
> ---
>  kernel/exit.c |   39 +++
>  kernel/fork.c |1 +
>  kernel/pid.c  |5 -
>  3 files changed, 40 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 40f734ed1193..7ce4cdf96a64 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -406,6 +406,8 @@ kill_orphaned_pgrp(struct task_struct *tsk, struct 
> task_struct *parent)
>  void mm_update_next_owner(struct mm_struct *mm)
>  {
>   struct task_struct *c, *g, *p = current;
> + struct mm_struct *tmp;
> + struct list_head *n;
>  
>  retry:
>   /*
> @@ -440,21 +442,49 @@ void mm_update_next_owner(struct mm_struct *mm)
>   if (c->mm == mm)
>   goto new_owner;
>   }
> + read_unlock(&tasklist_lock);
>  
>   /*
>* Search through everything else, we should not get here often.
>*/
> + rcu_read_lock();
>   for_each_process(g) {
> + /*
> +  * g->signal, g->mm and g->flags initialization of a just
> +  * created task must not reorder with linking the task to
> +  * tasks list. Pairs with smp_mb() implied by alloc_pid().
> +  */
> + smp_rmb();
>   if (g->flags & PF_KTHREAD)
>   continue;
>   for_each_thread(g, c) {
> - if (c->mm == mm)
> - goto new_owner;
> - if (c->mm)
> + /*
> +  * Make visible mm of iterated thread.
> +  * Pairs with smp_mb() implied by alloc_pid().
> +  */
> + if (c != g)
> + smp_rmb();
> + tmp = READ_ONCE(c->mm);
> + if (tmp == mm)
> + goto new_owner_nolock;
> + if (likely(tmp))
>   break;
> + n = READ_ONCE(c->thread_node.next);
> + /*
> +  * All mm are NULL, so iterated threads already exited.
> +  * Make sure we see their children.
> +  * Pairs with smp_mb() in do_exit().
> +  */
> + if (n == &g->signal->thread_head)
> + smp_rmb();
>   }
> + /*
> +  * Children of exited thread group are visible due to the above
> +  * smp_rmb(). Threads with mm != NULL can't create a child with
> +  * the mm we're looking for. So, no additional smp_rmb() needed.
> +  */
>   }
> - read_unlock(&tasklist_lock);
> + rcu_read_unlock();
>   /*
>* We found no owner yet mm_users > 1: this implies that we are
>* most likely racing with swapoff (try_to_unuse()) o

Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers

2018-04-23 Thread Andrea Parri
Hi Waiman,

On Mon, Apr 23, 2018 at 12:46:12PM -0400, Waiman Long wrote:
> On 04/10/2018 01:22 PM, Waiman Long wrote:
> > It was observed occasionally in PowerPC systems that there was reader
> > who had not been woken up but that its waiter->task had been cleared.

Can you provide more details about these observations?  (links to LKML
posts, traces, applications used/micro-benchmarks, ...)


> >
> > One probable cause of this missed wakeup may be the fact that the
> > waiter->task and the task state have not been properly synchronized as
> > the lock release-acquire pair of different locks in the wakeup code path
> > does not provide a full memory barrier guarantee.

I guess that by the "pair of different locks" you mean (sem->wait_lock,
p->pi_lock), right?  BTW, __rwsem_down_write_failed_common() is calling
wake_up_q() _before_ releasing the wait_lock: did you intend to exclude
this callsite? (why?)


> So smp_store_mb()
> > is now used to set waiter->task to NULL to provide a proper memory
> > barrier for synchronization.

Mmh; the patch is not introducing an smp_store_mb()... My guess is that
you are thinking of the sequence:

smp_store_release(&waiter->task, NULL);
[...]
smp_mb(); /* added with your patch */

or what am I missing?


> >
> > Signed-off-by: Waiman Long 
> > ---
> >  kernel/locking/rwsem-xadd.c | 17 +
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index e795908..b3c588c 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -209,6 +209,23 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
> > smp_store_release(&waiter->task, NULL);
> > }
> >  
> > +   /*
> > +* To avoid missed wakeup of reader, we need to make sure
> > +* that task state and waiter->task are properly synchronized.
> > +*
> > +* wakeup sleep
> > +* -- -
> > +* __rwsem_mark_wake:   rwsem_down_read_failed*:
> > +*   [S] waiter->task [S] set_current_state(state)
> > +*   MB   MB
> > +* try_to_wake_up:
> > +*   [L] state[L] waiter->task
> > +*
> > +* For the wakeup path, the original lock release-acquire pair
> > +* does not provide enough guarantee of proper synchronization.
> > +*/
> > +   smp_mb();
> > +
> > adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
> > if (list_empty(&sem->wait_list)) {
> > /* hit end of list above */
> 
> Ping!
> 
> Any thought on this patch?
> 
> I am wondering if there is a cheaper way to apply the memory barrier
> just on architectures that need it.

try_to_wake_up() does:

raw_spin_lock_irqsave(&p->pi_lock, flags);
smp_mb__after_spinlock();
if (!(p->state & state))

My understanding is that this smp_mb__after_spinlock() provides us with
the guarantee you described above.  The smp_mb__after_spinlock() should
represent a 'cheaper way' to provide such a guarantee.

If this understanding is correct, the remaining question would be about
whether you want to rely on (and document) the smp_mb__after_spinlock()
in the callsite in question (the comment in wake_up_q()

   /*
* wake_up_process() implies a wmb() to pair with the queueing
* in wake_q_add() so as not to miss wakeups.
*/

does not appear to be sufficient...).

  Andrea


> 
> Cheers,
> Longman
> 


Re: [RFC PATCH v2 0/6] Documentation/features: Provide and apply 'features-refresh.sh'

2018-04-20 Thread Andrea Parri
Hi Ingo, Jon,

On Sun, Apr 08, 2018 at 06:30:21PM +0200, Andrea Parri wrote:
> Hi,
> 
> This series provides the script 'features-refresh.sh', which operates on
> the arch support status files, and it applies this script to refresh the
> status files in place; previous discussions about this series are at [1].
> 
> The series is organized as follows.
> 
>   - Patch 1/6 adds the script to 'Documentation/features/scripts/'.
> 
>   - Patch 2/6 presents the results of running the script; this run
> also printed the messages
> 
>WARNING: 'HAVE_BPF_JIT' is not a valid Kconfig
>WARNING: '__HAVE_ARCH_STRNCASECMP' is not a valid Kconfig
>WARNING: 'Optimized asm/rwsem.h' is not a valid Kconfig
>WARNING: '__HAVE_ARCH_PTE_SPECIAL' is not a valid Kconfig
> 
> to standard output.
> 
>   - Patches 3-6/6 fix each of these warnings.
> 
> (Applies on today's mainline.)
> 
> Cheers,
>   Andrea
> 
> [1] https://marc.info/?l=linux-kernel&m=152223974927255&w=2
> https://marc.info/?l=linux-kernel&m=152277458614862&w=2
> 
> Changes in v2:
>   - support negation operators in Kconfig (suggested by Ingo Molnar)
>   - reorder patches 2/6 and 3/6 (suggested by Ingo Molnar)
>   - add patches 4-6/6 (suggested by Ingo Molnar)
> 
> Andrea Parri (6):
>   Documentation/features: Add script that refreshes the arch support
> status files in place
>   Documentation/features: Refresh the arch support status files in place
>   Documentation/features/core: Add arch support status files for
> 'cBPF-JIT' and 'eBPF-JIT'
>   Documentation/features/locking: Use '!RWSEM_GENERIC_SPINLOCK' as
> Kconfig for 'rwsem-optimized'
>   Documentation/features/lib: Remove arch support status file for
> 'strncasecmp'
>   Documentation/features/vm: Remove arch support status file for
> 'pte_special'

I understand that you didn't get the chance to look into this yet ;D
please let me know if you'd like me to rebase and re-send the series.

Thanks,
  Andrea


> 
>  .../features/core/BPF-JIT/arch-support.txt | 31 ---
>  .../features/core/cBPF-JIT/arch-support.txt| 33 
>  .../features/core/eBPF-JIT/arch-support.txt| 33 
>  .../core/generic-idle-thread/arch-support.txt  |  4 +-
>  .../features/core/jump-labels/arch-support.txt |  2 +
>  .../features/core/tracehook/arch-support.txt   |  2 +
>  .../features/debug/KASAN/arch-support.txt  |  4 +-
>  .../debug/gcov-profile-all/arch-support.txt|  2 +
>  Documentation/features/debug/kgdb/arch-support.txt |  4 +-
>  .../debug/kprobes-on-ftrace/arch-support.txt   |  2 +
>  .../features/debug/kprobes/arch-support.txt|  4 +-
>  .../features/debug/kretprobes/arch-support.txt |  4 +-
>  .../features/debug/optprobes/arch-support.txt  |  4 +-
>  .../features/debug/stackprotector/arch-support.txt |  2 +
>  .../features/debug/uprobes/arch-support.txt|  6 +-
>  .../debug/user-ret-profiler/arch-support.txt   |  2 +
>  .../features/io/dma-api-debug/arch-support.txt |  2 +
>  .../features/io/dma-contiguous/arch-support.txt|  4 +-
>  .../features/io/sg-chain/arch-support.txt  |  2 +
>  .../features/lib/strncasecmp/arch-support.txt  | 31 ---
>  .../locking/cmpxchg-local/arch-support.txt |  4 +-
>  .../features/locking/lockdep/arch-support.txt  |  4 +-
>  .../locking/queued-rwlocks/arch-support.txt| 10 ++-
>  .../locking/queued-spinlocks/arch-support.txt  |  8 +-
>  .../locking/rwsem-optimized/arch-support.txt   | 10 ++-
>  .../features/perf/kprobes-event/arch-support.txt   |  6 +-
>  .../features/perf/perf-regs/arch-support.txt   |  4 +-
>  .../features/perf/perf-stackdump/arch-support.txt  |  4 +-
>  .../sched/membarrier-sync-core/arch-support.txt|  2 +
>  .../features/sched/numa-balancing/arch-support.txt |  6 +-
>  Documentation/features/scripts/features-refresh.sh | 98 
> ++
>  .../seccomp/seccomp-filter/arch-support.txt|  6 +-
>  .../time/arch-tick-broadcast/arch-support.txt  |  4 +-
>  .../features/time/clockevents/arch-support.txt |  4 +-
>  .../time/context-tracking/arch-support.txt |  2 +
>  .../features/time/irq-time-acct/arch-support.txt   |  4 +-
>  .../time/modern-timekeeping/arch-support.txt   |  2 +
>  .../features/time/virt-cpuacct/arch-support.txt|  2 +
>  .../features/vm/ELF-ASLR/arch-support.txt  |  4 +-
>  .../features/vm/PG_uncached/arch-support.txt   |  2 +
>  Documentation/features/vm/THP/arch-support.txt |  2 +
> 

[PATCH 0/2] tools/memory-model: References updates

2018-04-20 Thread Andrea Parri
A couple of fixes to our references and comments: the first updating
ASPLOS information, the second adding a reference.

Cheers,
  Andrea

Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 

Andrea Parri (2):
  tools/memory-model: Update ASPLOS information
  tools/memory-model: Add reference for 'Simplifying ARM concurrency'

 tools/memory-model/Documentation/references.txt | 17 -
 tools/memory-model/linux-kernel.bell|  4 ++--
 tools/memory-model/linux-kernel.cat |  4 ++--
 tools/memory-model/linux-kernel.def |  4 ++--
 4 files changed, 18 insertions(+), 11 deletions(-)

-- 
2.7.4



[PATCH 2/2] tools/memory-model: Add reference for 'Simplifying ARM concurrency'

2018-04-20 Thread Andrea Parri
The paper discusses the revised ARMv8 memory model; this revision
had an important impact on the design of the LKMM.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
---
 tools/memory-model/Documentation/references.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/memory-model/Documentation/references.txt 
b/tools/memory-model/Documentation/references.txt
index 74f448f2616a3..b177f3e4a614d 100644
--- a/tools/memory-model/Documentation/references.txt
+++ b/tools/memory-model/Documentation/references.txt
@@ -63,6 +63,12 @@ oShaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan 
Nienhuis,
Principles of Programming Languages (POPL 2017). ACM, New York,
NY, USA, 429–442.
 
+o  Christopher Pulte, Shaked Flur, Will Deacon, Jon French,
+   Susmit Sarkar, and Peter Sewell. 2018. "Simplifying ARM concurrency:
+   multicopy-atomic axiomatic and operational models for ARMv8". In
+   Proceedings of the ACM on Programming Languages, Volume 2, Issue
+   POPL, Article No. 19. ACM, New York, NY, USA.
+
 
 Linux-kernel memory model
 =
-- 
2.7.4



[PATCH 1/2] tools/memory-model: Update ASPLOS information

2018-04-20 Thread Andrea Parri
ASPLOS 2018 was held in March: make sure this is reflected in
header comments and references.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
---
 tools/memory-model/Documentation/references.txt | 11 ++-
 tools/memory-model/linux-kernel.bell|  4 ++--
 tools/memory-model/linux-kernel.cat |  4 ++--
 tools/memory-model/linux-kernel.def |  4 ++--
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/tools/memory-model/Documentation/references.txt 
b/tools/memory-model/Documentation/references.txt
index ba2e34c2ec3f5..74f448f2616a3 100644
--- a/tools/memory-model/Documentation/references.txt
+++ b/tools/memory-model/Documentation/references.txt
@@ -67,11 +67,12 @@ o   Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan 
Nienhuis,
 Linux-kernel memory model
 =
 
-o      Andrea Parri, Alan Stern, Luc Maranget, Paul E. McKenney,
-   and Jade Alglave.  2017. "A formal model of
-   Linux-kernel memory ordering - companion webpage".
-   http://moscova.inria.fr/∼maranget/cats7/linux/. (2017). [Online;
-   accessed 30-January-2017].
+o  Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
+   Alan Stern.  2018. "Frightening small children and disconcerting
+   grown-ups: Concurrency in the Linux kernel". In Proceedings of
+   the 23rd International Conference on Architectural Support for
+   Programming Languages and Operating Systems (ASPLOS 2018). ACM,
+   New York, NY, USA, 405-418.  Webpage: http://diy.inria.fr/linux/.
 
 o  Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
Alan Stern.  2017.  "A formal kernel memory-ordering model (part 1)"
diff --git a/tools/memory-model/linux-kernel.bell 
b/tools/memory-model/linux-kernel.bell
index 432c7cf71b237..64f5740e0e751 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -5,10 +5,10 @@
  * Copyright (C) 2017 Alan Stern ,
  *Andrea Parri 
  *
- * An earlier version of this file appears in the companion webpage for
+ * An earlier version of this file appeared in the companion webpage for
  * "Frightening small children and disconcerting grown-ups: Concurrency
  * in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
- * which is to appear in ASPLOS 2018.
+ * which appeared in ASPLOS 2018.
  *)
 
 "Linux-kernel memory consistency model"
diff --git a/tools/memory-model/linux-kernel.cat 
b/tools/memory-model/linux-kernel.cat
index 1e5c4653dd12e..59b5cbe6b6240 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -5,10 +5,10 @@
  * Copyright (C) 2017 Alan Stern ,
  *Andrea Parri 
  *
- * An earlier version of this file appears in the companion webpage for
+ * An earlier version of this file appeared in the companion webpage for
  * "Frightening small children and disconcerting grown-ups: Concurrency
  * in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
- * which is to appear in ASPLOS 2018.
+ * which appeared in ASPLOS 2018.
  *)
 
 "Linux-kernel memory consistency model"
diff --git a/tools/memory-model/linux-kernel.def 
b/tools/memory-model/linux-kernel.def
index f0553bd37c085..6fa3eb28d40b5 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -1,9 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0+
 //
-// An earlier version of this file appears in the companion webpage for
+// An earlier version of this file appeared in the companion webpage for
 // "Frightening small children and disconcerting grown-ups: Concurrency
 // in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
-// which is to appear in ASPLOS 2018.
+// which appeared in ASPLOS 2018.
 
 // ONCE
 READ_ONCE(X) __load{once}(X)
-- 
2.7.4



[PATCH] MAINTAINERS: Update e-mail address for Andrea Parri

2018-04-20 Thread Andrea Parri
I moved to Amarula Solutions; switch to my work e-mail address.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 27ffa56e75500..2544827564f82 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8153,7 +8153,7 @@ F:	drivers/misc/lkdtm*
 
 LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
 M: Alan Stern 
-M:     Andrea Parri 
+M:     Andrea Parri 
 M: Will Deacon 
 M: Peter Zijlstra 
 M: Boqun Feng 
-- 
2.7.4



Re: [PATCH RFC tools/memory-model 4/5] tools/memory-model: Add model support for spin_is_locked

2018-04-18 Thread Andrea Parri
On Mon, Apr 16, 2018 at 09:22:50AM -0700, Paul E. McKenney wrote:
> From: Luc Maranget 
> 
> This commit first adds a trivial macro for spin_is_locked() to
> linux-kernel.def.
> 
> It also adds cat code for enumerating all possible matches of lock
> write events (set LKW) with islocked events returning true (set RL,
> for Read from Lock), and unlock write events (set UL) with islocked
> events returning false (set RU, for Read from Unlock).  Note that this
> intentionally does not model uniprocessor kernels (CONFIG_SMP=n) built
> with CONFIG_DEBUG_SPINLOCK=n, in which spin_is_locked() unconditionally
> returns zero.
> 
> It also adds a pair of litmus tests demonstrating the minimal ordering
> provided by spin_is_locked() in conjunction with spin_lock().  Will Deacon
> noted that this minimal ordering happens on ARMv8:
> https://lkml.kernel.org/r/20180226162426.gb17...@arm.com
> 
> Notice that herd7 installations strictly older than version 7.49
> do not handle the new constructs.
> 
> Signed-off-by: Luc Maranget 
> Cc: Alan Stern 
> Cc: Will Deacon 
> Cc: Peter Zijlstra 
> Cc: Boqun Feng 
> Cc: Nicholas Piggin 
> Cc: David Howells 
> Cc: Jade Alglave 
> Cc: Luc Maranget 
> Cc: "Paul E. McKenney" 
> Cc: Akira Yokosawa 
> Cc: Ingo Molnar 
> Signed-off-by: Paul E. McKenney 

I understand that it's acceptable to not list all maintainers in the
commit message, but that does look like an omission...


> ---
>  tools/memory-model/linux-kernel.def                 |  1 +
>  .../MP+polockmbonce+poacquiresilsil.litmus          | 30
>  .../MP+polockonce+poacquiresilsil.litmus            | 29
>  tools/memory-model/litmus-tests/README              | 10
>  tools/memory-model/lock.cat                         | 53 --
>  5 files changed, 119 insertions(+), 4 deletions(-)
>  create mode 100644 tools/memory-model/litmus-tests/MP+polockmbonce+poacquiresilsil.litmus
>  create mode 100644 tools/memory-model/litmus-tests/MP+polockonce+poacquiresilsil.litmus
> 
> diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
> index 6bd3bc431b3d..f0553bd37c08 100644
> --- a/tools/memory-model/linux-kernel.def
> +++ b/tools/memory-model/linux-kernel.def
> @@ -38,6 +38,7 @@ cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
>  spin_lock(X) { __lock(X); }
>  spin_unlock(X) { __unlock(X); }
>  spin_trylock(X) __trylock(X)
> +spin_is_locked(X) __islocked(X)
>  
>  // RCU
>  rcu_read_lock() { __fence{rcu-lock}; }
> diff --git a/tools/memory-model/litmus-tests/MP+polockmbonce+poacquiresilsil.litmus b/tools/memory-model/litmus-tests/MP+polockmbonce+poacquiresilsil.litmus
> new file mode 100644
> index ..37357404a08d
> --- /dev/null
> +++ b/tools/memory-model/litmus-tests/MP+polockmbonce+poacquiresilsil.litmus
> @@ -0,0 +1,30 @@
> +C MP+polockmbonce+poacquiresilsil
> +
> +(*
> + * Result: Never
> + *
> + * Do spinlocks combined with smp_mb__after_spinlock() provide order
> + * to outside observers using spin_is_locked() to sense the lock-held
> + * state, ordered by acquire?  Note that when the first spin_is_locked()
> + * returns false and the second true, we know that the smp_load_acquire()
> + * executed before the lock was acquired (loosely speaking).
> + *)
> +
> +{
> +}
> +
> +P0 (spinlock_t *lo, int *x) {
> + spin_lock(lo);
> + smp_mb__after_spinlock();
> + WRITE_ONCE(*x,1);
> + spin_unlock(lo);
> +}
> +
> +P1 (spinlock_t *lo, int *x) {
> + int r1; int r2; int r3;
> + r1 = smp_load_acquire(x);
> +r2 = spin_is_locked(lo);
> + r3 = spin_is_locked(lo);
> +}
> +
> +exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1)
> diff --git a/tools/memory-model/litmus-tests/MP+polockonce+poacquiresilsil.litmus b/tools/memory-model/litmus-tests/MP+polockonce+poacquiresilsil.litmus
> new file mode 100644
> index ..ebc2668f95ff
> --- /dev/null
> +++ b/tools/memory-model/litmus-tests/MP+polockonce+poacquiresilsil.litmus
> @@ -0,0 +1,29 @@
> +C MP+polockonce+poacquiresilsil
> +
> +(*
> + * Result: Sometimes
> + *
> + * Do spinlocks provide order to outside observers using spin_is_locked()
> + * to sense the lock-held state, ordered by acquire?  Note that when the
> + * first spin_is_locked() returns false and the second true, we know that
> + * the smp_load_acquire() executed before the lock was acquired (loosely
> + * speaking).
> + *)
> +
> +{
> +}
> +
> +P0 (spinlock_t *lo, int *x) {
> + spin_lock(lo);
> + WRITE_ONCE(*x,1);
> + spin_unlock(lo);
> +}
> +
> +P1 (spinlock_t *lo, int *x) {
> + int r1; int r2; int r3;
> + r1 = smp_load_acquire(x);
> +r2 = spin_is_locked(lo);
> + r3 = spin_is_locked(lo);
> +}
> +
> +exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1)

Please fix the style in the above litmus tests (c.f., e.g., your 2/5).
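
(presumably something along the lines of the below, matching the style
of your 2/5; a sketch, not tested:

P1(spinlock_t *lo, int *x)
{
	int r1;
	int r2;
	int r3;

	r1 = smp_load_acquire(x);
	r2 = spin_is_locked(lo);
	r3 = spin_is_locked(lo);
}

and similarly for P0 and for the other test.)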


> diff --git a/tools/memory-model/litmus-tests/README b/tools/memory-model/litmus-tests/README
> index 04096fb8b8d9..6919909bbd0f 100644
> --- a/tools/memory-model/litmus-tests/README

Re: [PATCH RFC tools/memory-model 2/5] tools/memory-model: Add litmus test for multicopy atomicity

2018-04-18 Thread Andrea Parri
On Mon, Apr 16, 2018 at 09:22:48AM -0700, Paul E. McKenney wrote:
> This commit adds a litmus test suggested by Alan Stern that is forbidden
> on multicopy atomic systems, but allowed on non-multicopy atomic systems.
> Note that other-multicopy atomic systems are examples of non-multicopy
> atomic systems.
> 
> Suggested-by: Alan Stern 
> Signed-off-by: Paul E. McKenney 
> ---
>  .../litmus-tests/SB+poonceoncescoh.litmus | 31 ++
>  1 file changed, 31 insertions(+)
>  create mode 100644 tools/memory-model/litmus-tests/SB+poonceoncescoh.litmus

We seem to be missing an entry in litmus-tests/README...
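
(e.g., mirroring the test's own description; a sketch:

SB+poonceoncescoh.litmus
	This litmus test demonstrates that LKMM is not multicopy
	atomic.)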


> 
> diff --git a/tools/memory-model/litmus-tests/SB+poonceoncescoh.litmus b/tools/memory-model/litmus-tests/SB+poonceoncescoh.litmus
> new file mode 100644
> index ..991a2d6dec63
> --- /dev/null
> +++ b/tools/memory-model/litmus-tests/SB+poonceoncescoh.litmus
> @@ -0,0 +1,31 @@
> +C SB+poonceoncescoh
> +
> +(*
> + * Result: Sometimes
> + *
> + * This litmus test demonstrates that LKMM is not multicopy atomic.
> + *)
> +
> +{}
> +
> +P0(int *x, int *y)
> +{
> + int r1;
> + int r2;
> +
> + WRITE_ONCE(*x, 1);
> + r1 = READ_ONCE(*x);
> + r2 = READ_ONCE(*y);
> +}
> +
> +P1(int *x, int *y)
> +{
> + int r3;
> + int r4;
> +
> + WRITE_ONCE(*y, 1);
> + r3 = READ_ONCE(*y);
> + r4 = READ_ONCE(*x);
> +}
> +
> +exists (0:r2=0 /\ 1:r4=0 /\ 0:r1=1 /\ 1:r3=1)

This test has a normalised name: why not use that?
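
(FWIW, herdtools7 also ships a "norm7" tool for computing such names;
modulo the exact flags, something like

  $ norm7 -bell linux-kernel.bell SB+poonceoncescoh.litmus

should print it -- if memory serves, the result resembles
"SB+rfionceonce-poonceonces".)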

  Andrea


> -- 
> 2.5.2
> 


Re: [PATCH 0/2] tools/memory-model: Model 'smp_store_mb()'

2018-04-13 Thread Andrea Parri
On Thu, Apr 12, 2018 at 02:06:27PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 12, 2018 at 02:22:48PM +0200, Andrea Parri wrote:
> > Hi,
> > 
> > This (tiny) series adds 'smp_store_mb()' to the model (patch 1/2), and
> > it fixes a stylistic discrepancy in 'linux-kernel.def' (patch 2/2).
> 
> I applied them both, thank you!
> 
> I had to apply 2/2 by hand for reasons that are not at all clear to
> me.  Please check to make sure I got it right.

It's OK for me.  Thanks,

  Andrea


> 
>       Thanx, Paul
> 
> > Cheers,
> >   Andrea
> > 
> > Andrea Parri (2):
> >   tools/memory-model: Model 'smp_store_mb()'
> >   tools/memory-model: Fix coding style in 'linux-kernel.def'
> > 
> >  tools/memory-model/linux-kernel.def | 29 +++--
> >  1 file changed, 15 insertions(+), 14 deletions(-)
> > 
> > -- 
> > 2.7.4
> > 
> 


Re: [PATCH] memory-model: fix cheat sheet typo

2018-04-13 Thread Andrea Parri
On Thu, Apr 12, 2018 at 02:18:36PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 12, 2018 at 01:21:55PM +0200, Andrea Parri wrote:
> > 
> > The litmus test that first comes to my mind when I think of cumulativity
> > (at least, 'cumulativity' as intended in LKMM) is:
> > 
> >    WRC+pooncerelease+rmbonceonce+Once.litmus
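> > 
> >    (a sketch of the test, from memory -- cf. the actual file for
> >    the authoritative version:
> > 
> >    P0(int *x)
> >    {
> >            WRITE_ONCE(*x, 1);
> >    }
> > 
> >    P1(int *x, int *y)
> >    {
> >            int r0;
> > 
> >            r0 = READ_ONCE(*x);
> >            smp_store_release(y, 1);
> >    }
> > 
> >    P2(int *x, int *y)
> >    {
> >            int r0;
> >            int r1;
> > 
> >            r0 = READ_ONCE(*y);
> >            smp_rmb();
> >            r1 = READ_ONCE(*x);
> >    }
> > 
> >    exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0))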
> 
> Removing the "cumul-fence* ;" from "let prop" does cause this test to be
> allowed, so looks plausible.
> 
> > for 'propagation', I could mention:
> > 
> >    IRIW+mbonceonces+OnceOnce.litmus
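> > 
> >    (again from memory: P0 does WRITE_ONCE(*x, 1), P2 does
> >    WRITE_ONCE(*y, 1), and the two readers look like
> > 
> >    P1(int *x, int *y)
> >    {
> >            int r0;
> >            int r1;
> > 
> >            r0 = READ_ONCE(*x);
> >            smp_mb();
> >            r1 = READ_ONCE(*y);
> >    }
> > 
> >    with P3 reading the two variables in the opposite order, and
> > 
> >    exists (1:r0=1 /\ 1:r1=0 /\ 3:r0=1 /\ 3:r1=0))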
> 
> And removing the "acyclic pb as propagation" causes this one to be allowed,
> so again plausible.
> 
> > (both tests are available in tools/memory-model/litmus-tests/). It would
> > be nice to mention these properties in the test descriptions, indeed.
> 
> Please see below.

Matching what I had in mind ;) thanks!

  Andrea


> 
>   Thanx, Paul
> 
> > You might find it useful to also visualize the 'valid' executions (with
> > the main events/relations) associated to each of these tests; for this,
> > 
> >    $ herd7 -conf linux-kernel.cfg litmus-tests/your-test.litmus \
> >            -show all -gv
> > 
> > (assuming you have 'gv' installed).
> 
> 
> 
> commit 494f11d10dd7d86e4a381cbe79e77f04cb0cee04
> Author: Paul E. McKenney 
> Date:   Thu Apr 12 14:15:57 2018 -0700
> 
> EXP tools/memory-model: Flag "cumulativity" and "propagation" tests
> 
> This commit flags WRC+pooncerelease+rmbonceonce+Once.litmus as being
> forbidden by LKMM cumulativity and IRIW+mbonceonces+OnceOnce.litmus as
> being forbidden by LKMM propagation.
> 
> Suggested-by: Andrea Parri 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus b/tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus
> index 50d5db9ea983..98a3716efa37 100644
> --- a/tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus
> +++ b/tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus
> @@ -7,7 +7,7 @@ C IRIW+mbonceonces+OnceOnce
>   * between each pairs of reads.  In other words, is smp_mb() sufficient to
>   * cause two different reading processes to agree on the order of a pair
>   * of writes, where each write is to a different variable by a different
> - * process?
> + * process?  This litmus test exercises LKMM's "propagation" rule.
>   *)
>  
>  {}
> diff --git a/tools/memory-model/litmus-tests/README b/tools/memory-model/litmus-tests/README
> index 6919909bbd0f..178941d2a51a 100644
> --- a/tools/memory-model/litmus-tests/README
> +++ b/tools/memory-model/litmus-tests/README
> @@ -23,7 +23,8 @@ IRIW+mbonceonces+OnceOnce.litmus
>   between each pairs of reads.  In other words, is smp_mb()
>   sufficient to cause two different reading processes to agree on
>   the order of a pair of writes, where each write is to a different
> - variable by a different process?
> + variable by a different process?  This litmus test is an example
> + that is forbidden by LKMM propagation.
>  
>  IRIW+poonceonces+OnceOnce.litmus
>   Test of independent reads from independent writes with nothing
> @@ -121,6 +122,7 @@ WRC+poonceonces+Once.litmus
>  WRC+pooncerelease+rmbonceonce+Once.litmus
>   These two are members of an extension of the MP litmus-test class
>   in which the first write is moved to a separate process.
> + The second is an example that is forbidden by LKMM cumulativity.
>  
>  Z6.0+pooncelock+pooncelock+pombonce.litmus
>   Is the ordering provided by a spin_unlock() and a subsequent
> diff --git a/tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus b/tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus
> index 97fcbffde9a0..5bda4784eb34 100644
> --- a/tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus
> +++ b/tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus
> @@ -5,7 +5,8 @@ C WRC+pooncerelease+rmbonceonce+Once
>   *
>   * This litmus test is an extension of the message-passing pattern, where
>   * the first write is moved to a separate process.  Because it features
> - * a release and a read memory barrier, it should be forbidden.
> + * a release and a read memory barrier, it should be forbidden.  This
> + * litmus test exercises LKMM cumulativity.
>   *)
>  
>  {}
> 


[PATCH 2/2] tools/memory-model: Fix coding style in 'linux-kernel.def'

2018-04-12 Thread Andrea Parri
Fix whitespace around semicolons.

Signed-off-by: Andrea Parri 
---
 tools/memory-model/linux-kernel.def | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index acf86f6f360a7..6bd3bc431b3da 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -17,12 +17,12 @@ rcu_dereference(X) __load{once}(X)
 smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 
 // Fences
-smp_mb() { __fence{mb} ; }
-smp_rmb() { __fence{rmb} ; }
-smp_wmb() { __fence{wmb} ; }
-smp_mb__before_atomic() { __fence{before-atomic} ; }
-smp_mb__after_atomic() { __fence{after-atomic} ; }
-smp_mb__after_spinlock() { __fence{after-spinlock} ; }
+smp_mb() { __fence{mb}; }
+smp_rmb() { __fence{rmb}; }
+smp_wmb() { __fence{wmb}; }
+smp_mb__before_atomic() { __fence{before-atomic}; }
+smp_mb__after_atomic() { __fence{after-atomic}; }
+smp_mb__after_spinlock() { __fence{after-spinlock}; }
 
 // Exchange
 xchg(X,V)  __xchg{mb}(X,V)
@@ -35,26 +35,26 @@ cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
 cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
 
 // Spinlocks
-spin_lock(X) { __lock(X) ; }
-spin_unlock(X) { __unlock(X) ; }
+spin_lock(X) { __lock(X); }
+spin_unlock(X) { __unlock(X); }
 spin_trylock(X) __trylock(X)
 
 // RCU
 rcu_read_lock() { __fence{rcu-lock}; }
-rcu_read_unlock() { __fence{rcu-unlock};}
+rcu_read_unlock() { __fence{rcu-unlock}; }
 synchronize_rcu() { __fence{sync-rcu}; }
 synchronize_rcu_expedited() { __fence{sync-rcu}; }
 
 // Atomic
 atomic_read(X) READ_ONCE(*X)
-atomic_set(X,V) { WRITE_ONCE(*X,V) ; }
+atomic_set(X,V) { WRITE_ONCE(*X,V); }
 atomic_read_acquire(X) smp_load_acquire(X)
 atomic_set_release(X,V) { smp_store_release(X,V); }
 
-atomic_add(V,X) { __atomic_op(X,+,V) ; }
-atomic_sub(V,X) { __atomic_op(X,-,V) ; }
-atomic_inc(X)   { __atomic_op(X,+,1) ; }
-atomic_dec(X)   { __atomic_op(X,-,1) ; }
+atomic_add(V,X) { __atomic_op(X,+,V); }
+atomic_sub(V,X) { __atomic_op(X,-,V); }
+atomic_inc(X)   { __atomic_op(X,+,1); }
+atomic_dec(X)   { __atomic_op(X,-,1); }
 
 atomic_add_return(V,X) __atomic_op_return{mb}(X,+,V)
 atomic_add_return_relaxed(V,X) __atomic_op_return{once}(X,+,V)
-- 
2.7.4



[PATCH 1/2] tools/memory-model: Model 'smp_store_mb()'

2018-04-12 Thread Andrea Parri
Model 'smp_store_mb(x, val);' as _semantically_ equivalent to
'WRITE_ONCE(x, val); smp_mb();'.
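
For instance (a minimal sketch, untested and with a made-up name), in
an SB-shaped litmus test the macro should forbid the weak outcome, just
like the explicit 'WRITE_ONCE(x, val); smp_mb();' sequence does:

C SB+storembonces

{}

P0(int *x, int *y)
{
	int r0;

	smp_store_mb(*x, 1);
	r0 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r0;

	smp_store_mb(*y, 1);
	r0 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r0=0) (* expected: Never *)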

Suggested-by: Paolo Bonzini 
Suggested-by: Peter Zijlstra 
Signed-off-by: Andrea Parri 
---
 tools/memory-model/linux-kernel.def | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index 397e4e67e8c84..acf86f6f360a7 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -14,6 +14,7 @@ smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
 rcu_dereference(X) __load{once}(X)
+smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 
 // Fences
 smp_mb() { __fence{mb} ; }
-- 
2.7.4


