barrier.h: redefine rmb and wmb to lwsync

Kautuk Consul Wed, 22 Feb 2023 01:04:23 -0800

A link from ibm.com states:
"Ensures that all instructions preceding the call to __lwsync
 complete before any subsequent store instructions can be executed
 on the processor that executed the function. Also, it ensures that
 all load instructions preceding the call to __lwsync complete before
 any subsequent load instructions can be executed on the processor
 that executed the function. This allows you to synchronize between
 multiple processors with minimal performance impact, as __lwsync
 does not wait for confirmation from each processor."


Thats why smp_rmb() and smp_wmb() are defined to lwsync.
But this same understanding applies to parallel pipeline
execution on each PowerPC processor.
So, use the lwsync instruction for rmb() and wmb() on the PPC
architectures that support it.

Signed-off-by: Kautuk Consul <kcon...@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/barrier.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index b95b666f0374..e088dacc0ee8 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -36,8 +36,15 @@
  * heavy-weight sync, so smp_wmb() can be a lighter-weight eieio.
  */
 #define __mb()   __asm__ __volatile__ ("sync" : : : "memory")
+
+/* The sub-arch has lwsync. */
+#if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
+#define __rmb() __asm__ __volatile__ ("lwsync" : : : "memory")
+#define __wmb() __asm__ __volatile__ ("lwsync" : : : "memory")
+#else
 #define __rmb()  __asm__ __volatile__ ("sync" : : : "memory")
 #define __wmb()  __asm__ __volatile__ ("sync" : : : "memory")
+#endif
 
 /* The sub-arch has lwsync */
 #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
-- 
2.31.1

[PATCH v2] arch/powerpc/include/asm/barrier.h: redefine rmb and wmb to lwsync

Reply via email to