On Mon, Oct 09, 2017 at 05:22:48PM -0700, Paul E. McKenney wrote:
> READ_ONCE() now implies smp_read_barrier_depends(), which means that
> the instances in arpt_do_table(), ipt_do_table(), and ip6t_do_table()
> are now redundant. This commit removes them and adjusts the comments.
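For illustration, the shape of the change is roughly this (a sketch; the
variable name follows the ipt_do_table() style, the exact hunks may differ):

	/* Before: the dependency barrier was spelled out by hand. */
	private = table->private;
	smp_read_barrier_depends();	/* now implied by READ_ONCE() */

	/* After: READ_ONCE() alone orders the dependent accesses. */
	private = READ_ONCE(table->private);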
Similar to the p
On Fri, Jul 07, 2017 at 12:33:49PM +0200, Ingo Molnar wrote:
> [1997/04] v2.1.36:
>
> the spin_unlock_wait() primitive gets introduced as part of release()
Whee, that goes _way_ further back than I thought it did :-)
> [2017/07] v4.12:
>
> wait_task_inactive() is still alive and
On Fri, Jul 07, 2017 at 10:31:28AM +0200, Ingo Molnar wrote:
> Here's a quick list of all the use cases:
>
> net/netfilter/nf_conntrack_core.c:
>
> - This is I believe the 'original', historic spin_unlock_wait() usecase
>   that still exists in the kernel. spin_unlock_wait() is only use
On Thu, Jul 06, 2017 at 12:49:12PM -0400, Alan Stern wrote:
> On Thu, 6 Jul 2017, Paul E. McKenney wrote:
>
> > On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote:
> > > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote:
> > > > And yes
On Thu, Jul 06, 2017 at 09:20:24AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 06, 2017 at 06:05:55PM +0200, Peter Zijlstra wrote:
> > On Thu, Jul 06, 2017 at 02:12:24PM +, David Laight wrote:
> > > From: Paul E. McKenney
>
> [ . . . ]
>
> > Now on the
On Thu, Jul 06, 2017 at 09:24:12AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote:
> > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote:
> > > And yes, there are architecture-specific optimizations for an
>
On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote:
> And yes, there are architecture-specific optimizations for an
> empty spin_lock()/spin_unlock() critical section, and the current
> arch_spin_unlock_wait() implementations show some of these optimizations.
> But I expect that perfo
On Thu, Jul 06, 2017 at 02:12:24PM +, David Laight wrote:
> From: Paul E. McKenney
> > Sent: 06 July 2017 00:30
> > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > and it appears that all callers could do just as well with a lock/unlock
> > pair. This series therefore
On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> >On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> >>On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> >>>Since spin_unlock_w
On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> > Since spin_unlock_wait() is defined as equivalent to spin_lock();
> > spin_unlock(), the memory barrier before spin_unlock_wait() is
> > also not required.
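A sketch of that argument, with a hypothetical caller (sma is illustrative):

	smp_mb();			/* the barrier in question */
	spin_unlock_wait(&sma->lock);

	/* by the stated definition this behaves as: */

	smp_mb();
	spin_lock(&sma->lock);
	spin_unlock(&sma->lock);

so the ordering the lock/unlock pair already provides makes the explicit
smp_mb() in front of it buy nothing extra.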
Note that A
On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> and if the hardware is not excessively clever (bad bet, by the
> way, long term),
This ^
> > Is there something other than conditional move instructions that could
> > come into play here? Obviously a much smarter CPU could evaluate
On Tue, Jun 07, 2016 at 08:45:53PM +0800, Boqun Feng wrote:
> On Tue, Jun 07, 2016 at 02:00:16PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 07, 2016 at 07:43:15PM +0800, Boqun Feng wrote:
> > > On Mon, Jun 06, 2016 at 06:08:36PM +0200, Peter Zijlstra wrote:
> > > >
On Tue, Jun 07, 2016 at 07:43:15PM +0800, Boqun Feng wrote:
> On Mon, Jun 06, 2016 at 06:08:36PM +0200, Peter Zijlstra wrote:
> > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > index ce2f75e32ae1..e1c29d352e0e 100644
> > --- a/kernel/locking/qspinlo
> This commit therefore adds words and an example demonstrating this
> limitation of control dependencies.
>
> Reported-by: Will Deacon
> Signed-off-by: Paul E. McKenney
Acked-by: Peter Zijlstra (Intel)
>
> diff --git a/Documentation/memory-b
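For context, the limitation in question (consistent with the examples
elsewhere in this thread) is that a control dependency only covers stores
inside the conditional, roughly:

	q = READ_ONCE(a);
	if (q)
		WRITE_ONCE(b, 1);	/* ordered after the load of "a" */
	WRITE_ONCE(c, 2);		/* NOT ordered; outside the "if" */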
On Thu, Jun 02, 2016 at 06:57:00PM +0100, Will Deacon wrote:
> > This 'replaces' commit:
> >
> > 54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and
> > spin_unlock_wait()")
> >
> > and seems to still work with the test case from that thread while
> > getting rid of the extra barriers.
On Fri, Jun 03, 2016 at 06:35:37PM +0100, Will Deacon wrote:
> On Fri, Jun 03, 2016 at 03:42:49PM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 03, 2016 at 01:47:34PM +0100, Will Deacon wrote:
> > > Even on x86, I think you need a fence here:
> > >
> > >
On Fri, Jun 03, 2016 at 01:47:34PM +0100, Will Deacon wrote:
> > Now, the normal atomic_foo_acquire() stuff uses smp_mb() as per
> > smp_mb__after_atomic(), its just ARM64 and PPC that go all 'funny' and
> > need this extra barrier. Blergh. So lets shelf this issue for a bit.
>
> Hmm... I certainl
On Fri, Jun 03, 2016 at 01:47:34PM +0100, Will Deacon wrote:
> Even on x86, I think you need a fence here:
>
> X86 lock
> {
> }
>  P0                | P1                ;
>  MOV EAX,$1        | MOV EAX,$1        ;
>  LOCK XCHG [x],EAX | LOCK XCHG [y],EAX ;
>  MOV EBX,[y]       | MOV EBX,[x]
On Fri, Jun 03, 2016 at 02:23:10PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 03, 2016 at 05:08:27AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 03, 2016 at 11:38:34AM +0200, Peter Zijlstra wrote:
> > > On Fri, Jun 03, 2016 at 02:48:38PM +0530, Vineet Gupta wrote:
> >
On Fri, Jun 03, 2016 at 05:08:27AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 03, 2016 at 11:38:34AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 03, 2016 at 02:48:38PM +0530, Vineet Gupta wrote:
> > > On Wednesday 25 May 2016 09:27 PM, Paul E. McKenney wrote:
> > >
On Fri, Jun 03, 2016 at 02:48:38PM +0530, Vineet Gupta wrote:
> On Wednesday 25 May 2016 09:27 PM, Paul E. McKenney wrote:
> > For your example, but keeping the compiler in check:
> >
> > 	if (READ_ONCE(a))
> > 		WRITE_ONCE(b, 1);
> > 	smp_rmb();
> > 	WRITE_ONCE(c, 2);
So I thi
On Thu, Jun 02, 2016 at 06:57:00PM +0100, Will Deacon wrote:
> > +++ b/include/asm-generic/qspinlock.h
> > @@ -28,30 +28,13 @@
> > */
> > static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
> > {
> > +	/*
> > +	 * See queued_spin_unlock_wait().
> > 	 *
> > +	 * An
On Thu, Jun 02, 2016 at 04:44:24PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 02, 2016 at 10:24:40PM +0800, Boqun Feng wrote:
> > On Thu, Jun 02, 2016 at 01:52:02PM +0200, Peter Zijlstra wrote:
> > About spin_unlock_wait() on ppc, I actually have a fix pending review
On Thu, Jun 02, 2016 at 11:11:07PM +0800, Boqun Feng wrote:
> On Thu, Jun 02, 2016 at 04:44:24PM +0200, Peter Zijlstra wrote:
> > Let me go ponder that some :/
> >
>
> An initial thought of the fix is making queued_spin_unlock_wait() an
> atomic-nop too:
On Thu, Jun 02, 2016 at 10:24:40PM +0800, Boqun Feng wrote:
> On Thu, Jun 02, 2016 at 01:52:02PM +0200, Peter Zijlstra wrote:
> About spin_unlock_wait() on ppc, I actually have a fix pending review:
>
> http://lkml.kernel.org/r/1461130033-70898-1-git-send-email-boqun.f...@gmail.com
Since all asm/barrier.h should/must include asm-generic/barrier.h the
latter is a good place for generic infrastructure like this.
This also allows archs to override the new
smp_acquire__after_ctrl_dep().
Signed-off-by: Peter Zijlstra (Intel)
---
include/asm-generic/barrier.h | 39
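The resulting convention, sketched (illustrative of the pattern rather than
the exact hunk):

/* include/asm-generic/barrier.h */
#ifndef smp_acquire__after_ctrl_dep
/*
 * A control dependency already orders LOAD->STORE; a read barrier after
 * it adds LOAD->LOAD, upgrading the dependency to full ACQUIRE.
 * Architectures where the control dependency alone suffices can
 * override this with barrier().
 */
#define smp_acquire__after_ctrl_dep()	smp_rmb()
#endif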
Even with spin_unlock_wait() fixed, nf_conntrack_lock{,_all}() is
borken as it misses a bunch of memory barriers to order the whole
global vs local locks scheme.
Even x86 (and other TSO archs) are affected.
Signed-off-by: Peter Zijlstra (Intel)
---
net/netfilter/nf_conntrack_core.c | 18
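Roughly the shape of the scheme and the barrier pairing it was missing
(a sketch pieced together from this thread, not the exact hunk):

	/* Global side: stop all per-bucket lock holders. */
	nf_conntrack_locks_all = true;
	smp_wmb();	/* order the flag store before the lock spins below */
	for (i = 0; i < CONNTRACK_LOCKS; i++)
		spin_unlock_wait(&nf_conntrack_locks[i]);

	/* Local side: back off while the global lock is held. */
	spin_lock(lock);
	while (unlikely(nf_conntrack_locks_all)) {
		spin_unlock(lock);
		smp_rmb();	/* order the flag load before the wait below */
		spin_unlock_wait(&nf_conntrack_locks_all_lock);
		spin_lock(lock);
	}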
This new form allows using hardware-assisted waiting.
Some hardware (ARM64 and x86) allows monitoring an address for changes,
so by providing a pointer we can use this to replace the cpu_relax().
Requested-by: Will Deacon
Suggested-by: Linus Torvalds
Signed-off-by: Peter Zijlstra (Intel
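The generic fallback has roughly this shape (a sketch; architectures with
address-monitoring hardware can override it to wait on *ptr instead of
calling cpu_relax()):

#define smp_cond_load_acquire(ptr, cond_expr) ({		\
	typeof(ptr) __PTR = (ptr);				\
	typeof(*ptr) VAL;					\
	for (;;) {						\
		VAL = READ_ONCE(*__PTR);			\
		if (cond_expr)					\
			break;					\
		cpu_relax();					\
	}							\
	smp_acquire__after_ctrl_dep();	/* ctrl dep -> ACQUIRE */ \
	VAL;							\
})

A caller then spins with something like
smp_cond_load_acquire(&lock->val.counter, !(VAL & _Q_LOCKED_MASK)).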
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() usecase, as that was somewhat stale/obscure.
Signed-off-by: Peter Zijlstra (Intel)
---
ipc/sem.c     |  1 -
kernel/exit.c |  8 ++--
kernel
Similar to -v3 in that it rewrites spin_unlock_wait() for all.
The new spin_unlock_wait() provides ACQUIRE semantics to match the RELEASE of
the spin_unlock() we waited for and thereby ensures we can fully observe its
critical section.
This fixes a number of (pretty much all) spin_unlock_wait() users
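In generic terms the strengthened primitive looks something like this
(illustrative only; each architecture got its own variant in the series):

	static inline void spin_unlock_wait(spinlock_t *lock)
	{
		/* Wait for the current owner, if any, to release... */
		while (spin_is_locked(lock))
			cpu_relax();
		/*
		 * ...then upgrade the control dependency so we observe
		 * the owner's whole critical section: an ACQUIRE to
		 * match the RELEASE of its spin_unlock().
		 */
		smp_acquire__after_ctrl_dep();
	}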
g.uk
Cc: r...@codeaurora.org
Cc: vgu...@synopsys.com
Cc: james.ho...@imgtec.com
Cc: real...@gmail.com
Cc: ys...@users.sourceforge.jp
Cc: tony.l...@intel.com
Cc: cmetc...@mellanox.com
Signed-off-by: Peter Zijlstra (Intel)
---
arch/alpha/include/asm/spinlock.h |  9 +++--
arch/arc/i
Introduce smp_acquire__after_ctrl_dep(); this construct is not
uncommon, but the lack of this barrier is.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h | 17 -
ipc/sem.c                | 14 ++
2 files changed, 14 insertions(+), 17 deletions
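The pattern being named, sketched with a hypothetical flag:

	/* A control-dependency spin: orders the load vs later stores only. */
	while (!READ_ONCE(*flag))
		cpu_relax();
	/* Upgrade to ACQUIRE so later loads are ordered as well. */
	smp_acquire__after_ctrl_dep();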
Since TILE doesn't do read speculation, its control dependencies also
guarantee LOAD->LOAD order and we don't need the additional RMB
otherwise required to provide ACQUIRE semantics.
Acked-by: Chris Metcalf
Signed-off-by: Peter Zijlstra (Intel)
---
arch/tile/include/asm/bar
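So the TILE override can collapse to a compiler barrier, roughly:

/* arch/tile: loads are not speculated, hence the control dependency
 * alone already gives LOAD->LOAD as well as LOAD->STORE ordering. */
#define smp_acquire__after_ctrl_dep()	barrier()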
On Wed, Jun 01, 2016 at 03:07:14PM +0100, Will Deacon wrote:
> On Wed, Jun 01, 2016 at 02:45:41PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 01, 2016 at 01:13:33PM +0100, Will Deacon wrote:
> > > On Wed, Jun 01, 2016 at 02:06:54PM +0200, Peter Zijlstra wrote:
> >
> &
On Wed, Jun 01, 2016 at 09:52:14PM +0800, Boqun Feng wrote:
> On Tue, May 31, 2016 at 11:41:37AM +0200, Peter Zijlstra wrote:
> > @@ -292,7 +282,7 @@ static void sem_wait_array(struct sem_ar
> > 		sem = sma->sem_base + i;
> > 		spi
On Wed, Jun 01, 2016 at 01:13:33PM +0100, Will Deacon wrote:
> On Wed, Jun 01, 2016 at 02:06:54PM +0200, Peter Zijlstra wrote:
> > Works for me; but that would lose using cmpwait() for
> > !smp_cond_load_acquire() spins, you fine with that?
> >
> > The two convers
On Wed, Jun 01, 2016 at 01:00:10PM +0100, Will Deacon wrote:
> On Wed, Jun 01, 2016 at 11:31:58AM +0200, Peter Zijlstra wrote:
> > Will, since ARM64 seems to want to use this, does the below make sense
> > to you?
>
> Not especially -- I was going to override smp_con
On Wed, Jun 01, 2016 at 12:24:32PM +0100, Will Deacon wrote:
> > --- a/arch/arm/include/asm/spinlock.h
> > +++ b/arch/arm/include/asm/spinlock.h
> > @@ -50,8 +50,22 @@ static inline void dsb_sev(void)
> > * memory.
> > */
> >
> > -#define arch_spin_unlock_wait(lock) \
> > -	do { while (arch
On Tue, May 31, 2016 at 04:01:06PM -0400, Waiman Long wrote:
> You are doing two READ_ONCE's in the smp_cond_load_acquire loop. Can we
> change it to do just one READ_ONCE, like
>
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -229,12 +229,18 @@ do {
> * value;
This new form allows using hardware-assisted waiting.
Requested-by: Will Deacon
Suggested-by: Linus Torvalds
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h   | 25 +++--
kernel/locking/qspinlock.c | 12 ++--
kernel/sched/core.c        |  8
n.id.au
Cc: vgu...@synopsys.com
Cc: r...@codeaurora.org
Cc: james.ho...@imgtec.com
Cc: real...@gmail.com
Cc: tony.l...@intel.com
Cc: ys...@users.sourceforge.jp
Cc: cmetc...@mellanox.com
Signed-off-by: Peter Zijlstra (Intel)
---
arch/alpha/include/asm/spinlock.h |  7 +--
arch/arc/i
Introduce smp_acquire__after_ctrl_dep(); this construct is not
uncommon, but the lack of this barrier is.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h | 18 +-
ipc/sem.c                | 14 ++
2 files changed, 15 insertions(+), 17 deletions
Since all asm/barrier.h should/must include asm-generic/barrier.h the
latter is a good place for generic infrastructure like this.
This also allows archs to override the new
smp_acquire__after_ctrl_dep().
Signed-off-by: Peter Zijlstra (Intel)
---
arch/alpha/include/asm/spinlock.h |  2
Provide the cmpwait() primitive, which will 'spin' wait for a variable
to change, and use it to implement smp_cond_load_acquire().
This primitive can be implemented with hardware assist on some
platforms (ARM64, x86).
Suggested-by: Will Deacon
Signed-off-by: Peter Zijlstra (Intel)
--
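A generic fallback for cmpwait() would look roughly like so (a sketch;
ARM64 can use LDXR/WFE and x86 MONITOR/MWAIT underneath):

#ifndef cmpwait
#define cmpwait(ptr, val) ({				\
	typeof(ptr) __ptr = (ptr);			\
	typeof(val) __val = (val);			\
	/* Spin until *__ptr is seen to differ from val. */ \
	while (READ_ONCE(*__ptr) == __val)		\
		cpu_relax();				\
})
#endif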
Similar to -v2 in that it rewrites spin_unlock_wait() for all.
The new spin_unlock_wait() provides ACQUIRE semantics to match the RELEASE of
the spin_unlock() we waited for and thereby ensures we can fully observe its
critical section.
This fixes a number of (pretty much all) spin_unlock_wait() users
Since TILE doesn't do read speculation, its control dependencies also
guarantee LOAD->LOAD order and we don't need the additional RMB
otherwise required to provide ACQUIRE semantics.
Cc: Chris Metcalf
Signed-off-by: Peter Zijlstra (Intel)
---
arch/tile/include/asm/barrier.h |
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() usecase, as that was somewhat stale/obscure.
Signed-off-by: Peter Zijlstra (Intel)
---
ipc/sem.c     |  1 -
kernel/exit.c |  8 ++--
kernel
Even with spin_unlock_wait() fixed, nf_conntrack_lock{,_all}() is
borken as it misses a bunch of memory barriers to order the whole
global vs local locks scheme.
Even x86 (and other TSO archs) are affected.
Signed-off-by: Peter Zijlstra (Intel)
---
net/netfilter/nf_conntrack_core.c | 18
On Fri, May 27, 2016 at 03:34:13PM -0400, Chris Metcalf wrote:
> >Does TILE never speculate reads? Because in that case the control
> >dependency already provides a full load->load,store barrier and you'd
> >want smp_acquire__after_ctrl_dep() to be a barrier() instead of
> >smp_rmb().
>
> Yes, th
On Thu, May 26, 2016 at 05:10:36PM -0400, Chris Metcalf wrote:
> On 5/26/2016 10:19 AM, Peter Zijlstra wrote:
> >--- a/arch/tile/lib/spinlock_32.c
> >+++ b/arch/tile/lib/spinlock_32.c
> >@@ -72,10 +72,14 @@ void arch_spin_unlock_wait(arch_spinlock
> > 	if (next == c
On Fri, May 27, 2016 at 08:46:49AM +0200, Martin Schwidefsky wrote:
> > This fixes a number of spin_unlock_wait() users that (not
> > unreasonably) rely on this.
>
> All that is missing is an smp_rmb(), no?
Indeed.
> > --- a/arch/s390/include/asm/spinlock.h
> > +++ b/arch/s390/include/asm/spinl
.com
Cc: j...@parisc-linux.org
Cc: m...@ellerman.id.au
Cc: schwidef...@de.ibm.com
Cc: ys...@users.sourceforge.jp
Cc: da...@davemloft.net
Cc: cmetc...@mellanox.com
Cc: ch...@zankel.net
Signed-off-by: Peter Zijlstra (Intel)
---
arch/alpha/include/asm/spinlock.h |  7 +--
arch/arc/i
Introduce smp_acquire__after_ctrl_dep(); this construct is not
uncommon, but the lack of this barrier is.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h | 15 ++-
ipc/sem.c                | 14 ++
2 files changed, 12 insertions(+), 17 deletions
nf_conntrack_lock{,_all}() is borken as it misses a bunch of memory
barriers to order the whole global vs local locks scheme.
Even x86 (and other TSO archs) are affected.
Signed-off-by: Peter Zijlstra (Intel)
---
net/netfilter/nf_conntrack_core.c | 18 +-
1 file changed, 17
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() usecase, as that was somewhat stale/obscure.
Signed-off-by: Peter Zijlstra (Intel)
---
ipc/sem.c     |  1 -
kernel/exit.c |  8 ++--
kernel
This new form allows using hardware-assisted waiting.
Requested-by: Will Deacon
Suggested-by: Linus Torvalds
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h   | 25 +++--
kernel/locking/qspinlock.c | 12 ++--
kernel/sched/core.c        |  8
This version rewrites all spin_unlock_wait() implementations to provide the
acquire order against the spin_unlock() release we wait on, ensuring we can
fully observe the critical section we waited on.
It pulls in the smp_cond_acquire() rewrite because it introduces a lot more
users of it. All simp
Provide the cmpwait() primitive, which will 'spin' wait for a variable
to change, and use it to implement smp_cond_load_acquire().
This primitive can be implemented with hardware assist on some
platforms (ARM64, x86).
Suggested-by: Will Deacon
Signed-off-by: Peter Zijlstra (Intel)
--
On Tue, May 24, 2016 at 10:41:43PM +0200, Manfred Spraul wrote:
> >--- a/net/netfilter/nf_conntrack_core.c
> >+++ b/net/netfilter/nf_conntrack_core.c
> >@@ -74,7 +74,18 @@ void nf_conntrack_lock(spinlock_t *lock)
> > 	spin_lock(lock);
> > 	while (unlikely(nf_conntrack_locks_all)) {
> >
On Wed, May 25, 2016 at 08:57:47AM -0700, Paul E. McKenney wrote:
> For your example, but keeping the compiler in check:
>
> 	if (READ_ONCE(a))
> 		WRITE_ONCE(b, 1);
> 	smp_rmb();
> 	WRITE_ONCE(c, 2);
>
> On x86, the smp_rmb() is as you say nothing but barrier(). Howev
On Tue, May 24, 2016 at 12:22:07PM -0400, Tejun Heo wrote:
> A delta but that particular libata usage is probably not needed now.
> The path was used while libata was gradually adding error handlers to
> the low level drivers. I don't think we have any left w/o one
> at this point. I'll ver
On Tue, May 24, 2016 at 09:17:13AM -0700, Linus Torvalds wrote:
> This needs to be either hidden inside the basic spinlock functions,
> _or_ it needs to be a clear and unambiguous interface. Anything that
> starts talking about control dependencies is not it.
>
> Note that this really is about na
On Tue, May 24, 2016 at 04:27:26PM +0200, Peter Zijlstra wrote:
> nf_conntrack_lock{,_all}() is borken as it misses a bunch of memory
> barriers to order the whole global vs local locks scheme.
>
> Even x86 (and other TSO archs) are affected.
>
> Signed-off-by: Pet
Introduce smp_acquire__after_ctrl_dep(); this construct is not
uncommon, but the lack of this barrier is.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/compiler.h | 14 ++
ipc/sem.c                | 14 ++
2 files changed, 12 insertions(+), 16 deletions
nf_conntrack_lock{,_all}() is borken as it misses a bunch of memory
barriers to order the whole global vs local locks scheme.
Even x86 (and other TSO archs) are affected.
Signed-off-by: Peter Zijlstra (Intel)
---
net/netfilter/nf_conntrack_core.c | 30 +-
1 file
As per recent discussions, spin_unlock_wait() has an unintuitive 'feature'.
lkml.kernel.org/r/20160520053926.gc31...@linux-uzut.site
These patches pull the existing solution into generic code; annotate all
spin_unlock_wait() users and fix nf_conntrack more.
The alternative -- putting smp_acquir
've waited on.
Many spin_unlock_wait() users were unaware of this issue and need
help.
Signed-off-by: Peter Zijlstra (Intel)
---
drivers/ata/libata-eh.c   |  4 +++-
kernel/exit.c             | 14 --
kernel/sched/completion.c |  7 +++
kernel/task_work.c        |  2 +