Re: [PATCH 2/7] atomics/treewide: rework ordering barriers

2018-06-05 Thread Peter Zijlstra
On Tue, Jun 05, 2018 at 03:02:56PM +0100, Mark Rutland wrote:
> Locally I've made this:
> 
> __atomic_acquire_fence()
> __atomic_release_fence()
> __atomic_pre_fence()
> __atomic_post_fence()
> 
> ... but I'm more than happy to rename however you prefer.

No, that's fine. As long as we're rid of the whole mb_before_release and
friends.

Thanks!


Re: [PATCH 2/7] atomics/treewide: rework ordering barriers

2018-06-05 Thread Mark Rutland
On Tue, Jun 05, 2018 at 03:56:16PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 05, 2018 at 02:28:02PM +0100, Mark Rutland wrote:
> > On Tue, Jun 05, 2018 at 02:16:23PM +0200, Peter Zijlstra wrote:
> > > and simply using smp_mb__{before,after}_atomic for the full fence, it's
> > > exactly what they were made for.
> > 
> > The snag is arch/alpha, where we have:
> > 
> > /*
> >  * To ensure dependency ordering is preserved for the _relaxed and
> >  * _release atomics, an smp_read_barrier_depends() is unconditionally
> >  * inserted into the _relaxed variants, which are used to build the
> >  * barriered versions. To avoid redundant back-to-back fences, we can
> >  * define the _acquire and _fence versions explicitly.
> >  */
> > #define __atomic_op_acquire(op, args...)   op##_relaxed(args)
> > #define __atomic_op_fence   __atomic_op_release
> > 
> > ... where alpha's smp_read_barrier_depends() is the same as
> > smp_mb__after_atomic().
> > 
> > Since alpha's non-value-returning atomics do not have the
> > smp_read_barrier_depends(), I can't just define an empty
> > smp_mb__after_atomic().
> > 
> > Thoughts?
> 
> Bah, of course there had to be a misfit.

Of course it had to be alpha. ;)

> Something along these lines then:
> 
>  __atomic_acquire_fence
>  __atomic_release_fence
>  __atomic_mb_before
>  __atomic_mb_after

Locally I've made this:

__atomic_acquire_fence()
__atomic_release_fence()
__atomic_pre_fence()
__atomic_post_fence()

... but I'm more than happy to rename however you prefer.

Thanks,
Mark.
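
For reference, a minimal sketch of the generic fallbacks this naming
suggests for <linux/atomic.h> (an illustration of the scheme under the
assumption that full barriers remain the default; not a quote from any
posted patch):

/* Architectures may override any of these to elide or weaken a barrier. */
#ifndef __atomic_acquire_fence
#define __atomic_acquire_fence		smp_mb__after_atomic
#endif

#ifndef __atomic_release_fence
#define __atomic_release_fence		smp_mb__before_atomic
#endif

#ifndef __atomic_pre_fence
#define __atomic_pre_fence		smp_mb__before_atomic
#endif

#ifndef __atomic_post_fence
#define __atomic_post_fence		smp_mb__after_atomic
#endif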


Re: [PATCH 2/7] atomics/treewide: rework ordering barriers

2018-06-05 Thread Peter Zijlstra
On Tue, Jun 05, 2018 at 02:28:02PM +0100, Mark Rutland wrote:
> On Tue, Jun 05, 2018 at 02:16:23PM +0200, Peter Zijlstra wrote:
> > and simply using smp_mb__{before,after}_atomic for the full fence, it's
> > exactly what they were made for.
> 
> The snag is arch/alpha, where we have:
> 
> /*
>  * To ensure dependency ordering is preserved for the _relaxed and
>  * _release atomics, an smp_read_barrier_depends() is unconditionally
>  * inserted into the _relaxed variants, which are used to build the
>  * barriered versions. To avoid redundant back-to-back fences, we can
>  * define the _acquire and _fence versions explicitly.
>  */
> #define __atomic_op_acquire(op, args...)   op##_relaxed(args)
> #define __atomic_op_fence   __atomic_op_release
> 
> ... where alpha's smp_read_barrier_depends() is the same as
> smp_mb__after_atomic().
> 
> Since alpha's non-value-returning atomics do not have the
> smp_read_barrier_depends(), I can't just define an empty
> smp_mb__after_atomic().
> 
> Thoughts?

Bah, of course there had to be a misfit.

Something along these lines then:

 __atomic_acquire_fence
 __atomic_release_fence
 __atomic_mb_before
 __atomic_mb_after

?
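
With those hooks, the full-fence wrapper could be composed along these
lines (a sketch assuming the names proposed above; alpha would then
override the trailing hook to avoid the redundant barrier):

#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	__atomic_mb_before();						\
	__ret = op##_relaxed(args);					\
	__atomic_mb_after();						\
	__ret;								\
})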


Re: [PATCH 2/7] atomics/treewide: rework ordering barriers

2018-06-05 Thread Mark Rutland
On Tue, Jun 05, 2018 at 02:16:23PM +0200, Peter Zijlstra wrote:
> On Tue, May 29, 2018 at 07:07:41PM +0100, Mark Rutland wrote:
> > +#ifndef __atomic_mb__after_acquire
> > +#define __atomic_mb__after_acquire smp_mb__after_atomic
> > +#endif
> > +
> > +#ifndef __atomic_mb__before_release
> > +#define __atomic_mb__before_release smp_mb__before_atomic
> > +#endif
> > +
> > +#ifndef __atomic_mb__before_fence
> > +#define __atomic_mb__before_fence  smp_mb__before_atomic
> > +#endif
> > +
> > +#ifndef __atomic_mb__after_fence
> > +#define __atomic_mb__after_fence   smp_mb__after_atomic
> > +#endif
> 
> I really _really_ dislike those names.. because they imply providing an
> MB before/after something else.
> 
> But that is exactly what they do not.
> 
> How about:
> 
>   __atomic_acquire_fence
>   __atomic_release_fence
>
> for the acquire/release things,

Sure, those sound fine to me.

> and simply using smp_mb__{before,after}_atomic for the full fence, it's
> exactly what they were made for.

The snag is arch/alpha, where we have:

/*
 * To ensure dependency ordering is preserved for the _relaxed and
 * _release atomics, an smp_read_barrier_depends() is unconditionally
 * inserted into the _relaxed variants, which are used to build the
 * barriered versions. To avoid redundant back-to-back fences, we can
 * define the _acquire and _fence versions explicitly.
 */
#define __atomic_op_acquire(op, args...)   op##_relaxed(args)
#define __atomic_op_fence   __atomic_op_release

... where alpha's smp_read_barrier_depends() is the same as
smp_mb__after_atomic().

Since alpha's non-value-returning atomics do not have the
smp_read_barrier_depends(), I can't just define an empty
smp_mb__after_atomic().

Thoughts?

Thanks,
Mark.
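
For illustration, with per-barrier hooks (using the names settled on
later in this thread), alpha could simply leave the redundant barriers
empty; a sketch of the assumed end result:

/*
 * The _relaxed ops already end in smp_read_barrier_depends(), so no
 * further barrier is needed after an acquire or after a full-fence op;
 * the release and pre-fence sides keep the default smp_mb().
 */
#define __atomic_acquire_fence()
#define __atomic_post_fence()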


Re: [PATCH 2/7] atomics/treewide: rework ordering barriers

2018-06-05 Thread Peter Zijlstra
On Tue, May 29, 2018 at 07:07:41PM +0100, Mark Rutland wrote:
> +#ifndef __atomic_mb__after_acquire
> +#define __atomic_mb__after_acquire   smp_mb__after_atomic
> +#endif
> +
> +#ifndef __atomic_mb__before_release
> +#define __atomic_mb__before_release  smp_mb__before_atomic
> +#endif
> +
> +#ifndef __atomic_mb__before_fence
> +#define __atomic_mb__before_fence smp_mb__before_atomic
> +#endif
> +
> +#ifndef __atomic_mb__after_fence
> +#define __atomic_mb__after_fence smp_mb__after_atomic
> +#endif

I really _really_ dislike those names.. because they imply providing an
MB before/after something else.

But that is exactly what they do not.

How about:

__atomic_acquire_fence
__atomic_release_fence

for the acquire/release things, and simply using
smp_mb__{before,after}_atomic for the full fence, it's exactly what they
were made for.
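
For illustration, the acquire/release wrappers could then be composed
along these lines (a sketch assuming the names above, with the hooks
defaulting to smp_mb__after_atomic()/smp_mb__before_atomic(); not a
quote from any posted patch):

#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	__atomic_acquire_fence();					\
	__ret;								\
})

#define __atomic_op_release(op, args...)				\
({									\
	__atomic_release_fence();					\
	op##_relaxed(args);						\
})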


[PATCH 2/7] atomics/treewide: rework ordering barriers

2018-05-29 Thread Mark Rutland
Currently architectures can override __atomic_op_*() to define the barriers
used before/after a relaxed atomic when used to build acquire/release/fence
variants.

This has the unfortunate property of requiring each architecture to define the
full wrapper for the atomics, rather than just the barriers it cares about, and
it gets in the way of generating atomics which can be easily read.

Instead, this patch has architectures define an optional set of barriers,
__atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
uses to build the wrappers.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland 
Cc: Boqun Feng 
Cc: Peter Zijlstra 
Cc: Will Deacon 
Cc: Andrea Parri 
---
 arch/alpha/include/asm/atomic.h   |  8 
 arch/powerpc/include/asm/atomic.h | 17 +
 arch/riscv/include/asm/atomic.h   | 17 +
 include/linux/atomic.h            | 38 ++
 4 files changed, 36 insertions(+), 44 deletions(-)

I'm aware that the naming of these barriers is potentially bad, and something
like atomic_acquire_mb()/atomic_release_mb() might be better. I only really
care that the core code can use the barriers directly rather than this being
hidden in an architecture's __atomic_op_*().

Peter/Will, I'll defer to your judgement on naming.

Thanks,
Mark.

diff --git a/arch/alpha/include/asm/atomic.h b/arch/alpha/include/asm/atomic.h
index 81bf9d15dcc1..aba549a0b147 100644
--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -18,11 +18,11 @@
  * To ensure dependency ordering is preserved for the _relaxed and
  * _release atomics, an smp_read_barrier_depends() is unconditionally
  * inserted into the _relaxed variants, which are used to build the
- * barriered versions. To avoid redundant back-to-back fences, we can
- * define the _acquire and _fence versions explicitly.
+ * barriered versions. Avoid redundant back-to-back fences in the
+ * _acquire and _fence versions.
  */
-#define __atomic_op_acquire(op, args...)   op##_relaxed(args)
-#define __atomic_op_fence  __atomic_op_release
+#define __atomic_mb__after_acquire()
+#define __atomic_mb__after_fence()
 
 #define ATOMIC_INIT(i) { (i) }
 #define ATOMIC64_INIT(i)   { (i) }
diff --git a/arch/powerpc/include/asm/atomic.h 
b/arch/powerpc/include/asm/atomic.h
index a0156cb43d1f..39342a1602e9 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -18,18 +18,11 @@
 * a "bne-" instruction at the end, so an isync is enough as an acquire barrier
  * on the platform without lwsync.
  */
-#define __atomic_op_acquire(op, args...)   \
-({ \
-   typeof(op##_relaxed(args)) __ret  = op##_relaxed(args); \
-   __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");\
-   __ret;  \
-})
-
-#define __atomic_op_release(op, args...)   \
-({ \
-   __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");\
-   op##_relaxed(args); \
-})
+#define __atomic_mb__after_acquire()   \
+   __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory")
+
+#define __atomic_mb__before_release()  \
+   __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory")
 
 static __inline__ int atomic_read(const atomic_t *v)
 {
diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 512b89485790..1d168857638a 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -25,18 +25,11 @@
 
 #define ATOMIC_INIT(i) { (i) }
 
-#define __atomic_op_acquire(op, args...)   \
-({ \
-   typeof(op##_relaxed(args)) __ret  = op##_relaxed(args); \
-   __asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory");\
-   __ret;  \
-})
-
-#define __atomic_op_release(op, args...)   \
-({ \
-   __asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory");\
-   op##_relaxed(args); \
-})
+#define __atomic_mb__after_acquire()   \
+   __asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory")
+
+#define __atomic_mb__before_release()  \
+   __asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory")
 
 static __always_inline int atomic_read(const atomic_t *v)
 {
diff --git a/include/linux/atomic.h 
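
(The include/linux/atomic.h hunk is cut off above. For orientation, a
sketch of how the generic wrappers might consume this patch's hooks --
an illustration, not the missing hunk:)

#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	__atomic_mb__after_acquire();					\
	__ret;								\
})

#define __atomic_op_release(op, args...)				\
({									\
	__atomic_mb__before_release();					\
	op##_relaxed(args);						\
})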
