Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-28 Thread Pat Haugen

On 04/09/2014 02:56 PM, David Edelsohn wrote:

> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.

I received approval off list for an updated variant of the patch for
4.8, so this patch has now been (re)committed to 4.8/4.9/trunk.


-Pat



Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread Bill Schmidt
On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
> On 03/25/2014 11:20 AM, Pat Haugen wrote:
> > Power8 can use lq/stq instructions for TI mode atomic_load/store.
> > Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> > bootstrap/regtest finishes)?
> >
> > -Pat
> >
> >
> > 2014-03-25  Pat Haugen pthau...@us.ibm.com
> >
> > * config/rs6000/sync.md (AINT mode_iterator): Move definition.
> > (loadsync_<mode>): Change mode.
> > (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
> > (load_quadpti, store_quadpti): New.
> > * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
> >
> > gcc/testsuite:
> > * gcc.target/powerpc/atomic_load_store-p8.c: New.
>
> Updated patch which was approved off list and I have committed.
 

Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:

checking for suffix of executables...
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c: In function 'libat_load_16':
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failure memory model for '__atomic_compare_exchange'
 atomic_compare_exchange_n (mptr, t, 0, true,
   ^
make[4]: *** [load_16_.lo] Error 1
make[4]: *** Waiting for unfinished jobs

Thanks,
Bill
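
For readers unfamiliar with the diagnostic in the log above: GCC's
__atomic_compare_exchange_n builtin takes separate success and failure
memory orders, and its documentation has stated that the failure order
cannot be __ATOMIC_RELEASE or __ATOMIC_ACQ_REL, nor stronger than the
success order. The sketch below only illustrates a load emulated with a
compare-and-swap using a valid pair of orders. The helper name
load_via_cas and the 64-bit width are assumptions chosen so that it
builds on any target; it is not the libatomic source, and it does not
explain why the bootstrap failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from libatomic: read *ptr by attempting
   to exchange 0 for 0.  */
static uint64_t
load_via_cas (uint64_t *ptr)
{
  uint64_t expected = 0;

  /* If *ptr is 0 the exchange stores 0 again; otherwise it fails and
     copies the current contents into 'expected'.  Either way 'expected'
     ends up holding the value observed in memory.  The failure order
     used here is no stronger than the success order.  */
  __atomic_compare_exchange_n (ptr, &expected, 0, true,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}
```

Calling load_via_cas on a variable holding 42 returns 42 and leaves the
variable unchanged. The cost of this approach is that even a pure load
performs a read-modify-write, which is the kind of sequence a plain
lq-based load avoids.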



Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread David Edelsohn
I have reverted this on trunk and asked Bill to revert this on the 4.8
branch. This patch is too risky to apply this close to a freeze for
4.9.

Sorry for the problems.

- David





Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread Bill Schmidt
On Wed, 2014-04-09 at 15:56 -0400, David Edelsohn wrote:
> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.

I've reverted this on 4.8 as r209254.

Thanks,
Bill

 
 



Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-08 Thread Pat Haugen

On 03/25/2014 11:20 AM, Pat Haugen wrote:
> Power8 can use lq/stq instructions for TI mode atomic_load/store.
> Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> bootstrap/regtest finishes)?
>
> -Pat
>
>
> 2014-03-25  Pat Haugen pthau...@us.ibm.com
>
> * config/rs6000/sync.md (AINT mode_iterator): Move definition.
> (loadsync_<mode>): Change mode.
> (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
> (load_quadpti, store_quadpti): New.
> * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
>
> gcc/testsuite:
> * gcc.target/powerpc/atomic_load_store-p8.c: New.


Updated patch which was approved off list and I have committed.


Index: testsuite/gcc.target/powerpc/atomic_load_store-p8.c
===
--- testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
+++ testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-final { scan-assembler-times "lq" 1 } } */
+/* { dg-final { scan-assembler-times "stq" 1 } } */
+/* { dg-final { scan-assembler-not "bl __atomic" } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+
+__int128
+atomic_load_128_relaxed (__int128 *ptr)
+{
+	return __atomic_load_n (ptr, __ATOMIC_RELAXED);
+}
+
+void
+atomic_store_128_relaxed (__int128 *ptr, __int128 val)
+{
+	__atomic_store_n (ptr, val, __ATOMIC_RELAXED);
+}
+
Index: config/rs6000/predicates.md
===
--- config/rs6000/predicates.md	(revision 209198)
+++ config/rs6000/predicates.md	(working copy)
@@ -624,14 +624,14 @@ (define_predicate "offsettable_mem_opera
        (match_test "offsettable_nonstrict_memref_p (op)")))
 
 ;; Return 1 if the operand is suitable for load/store quad memory.
-;; This predicate only checks for non-atomic loads/stores.
+;; This predicate only checks for non-atomic loads/stores (not lqarx/stqcx).
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
   rtx addr, op0, op1;
   int ret;
 
-  if (!TARGET_QUAD_MEMORY)
+  if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
     ret = 0;
 
   else if (!memory_operand (op, mode))
Index: config/rs6000/sync.md
===
--- config/rs6000/sync.md	(revision 209198)
+++ config/rs6000/sync.md	(working copy)
@@ -107,10 +107,17 @@ (define_insn "isync"
   "isync"
   [(set_attr "type" "isync")])
 
+;; Types that we should provide atomic instructions for.
+(define_mode_iterator AINT [QI
+			    HI
+			    SI
+			    (DI "TARGET_POWERPC64")
+			    (TI "TARGET_SYNC_TI")])
+
 ;; The control dependency used for load dependency described
 ;; in B.2.3 of the Power ISA 2.06B.
 (define_insn "loadsync_<mode>"
-  [(unspec_volatile:BLK [(match_operand:INT1 0 "register_operand" "r")]
+  [(unspec_volatile:BLK [(match_operand:AINT 0 "register_operand" "r")]
 			UNSPECV_ISYNC)
    (clobber (match_scratch:CC 1 "=y"))]
   ""
@@ -118,18 +125,56 @@ (define_insn "loadsync_<mode>"
   [(set_attr "type" "isync")
    (set_attr "length" "12")])
 
+(define_insn "load_quadpti"
+  [(set (match_operand:PTI 0 "quad_int_reg_operand" "=r")
+	(unspec:PTI
+	 [(match_operand:TI 1 "quad_memory_operand" "wQ")] UNSPEC_LSQ))]
+  "TARGET_SYNC_TI
+   && !reg_mentioned_p (operands[0], operands[1])"
+  "lq %0,%1"
+  [(set_attr "type" "load")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_load<mode>"
-  [(set (match_operand:INT1 0 "register_operand" "")		;; output
-	(match_operand:INT1 1 "memory_operand" ""))		;; memory
+  [(set (match_operand:AINT 0 "register_operand" "")		;; output
+	(match_operand:AINT 1 "memory_operand" ""))		;; memory
    (use (match_operand:SI 2 "const_int_operand" ""))]		;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_SYNC_TI)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
   if (model == MEMMODEL_SEQ_CST)
     emit_insn (gen_hwsync ());
 
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'lq'
+      if (indexed_address (XEXP (op1, 0), TImode))
+	{
+	  rtx old_addr = XEXP (op1, 0);
+	  rtx new_addr = force_reg (Pmode, old_addr);
+	  operands[1] = op1 = replace_equiv_address (op1, new_addr);
+	}
+
+      emit_insn (gen_load_quadpti (pti_reg, op1));
+
+      if (WORDS_BIG_ENDIAN)
+	emit_move_insn (op0, gen_lowpart (TImode, pti_reg));
+      else
+	{
+	  emit_move_insn (gen_lowpart (DImode, op0), gen_highpart (DImode, pti_reg));
+	  emit_move_insn (gen_highpart (DImode, op0), gen_lowpart (DImode, pti_reg));
+	}
+    }
 
   switch (model)
     {
@@ -146,12 +191,24 @@ (define_expand 
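
The little-endian branch of the atomic_load expander above copies the
two DImode halves of the PTImode temporary into the opposite halves of
the TImode result, while the big-endian branch copies the value
unchanged. A plain-C model of that data movement might look like the
sketch below. The struct dw_pair and the function copy_quad are
hypothetical names introduced only for illustration; the model says
nothing about register allocation or about the lq instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-doubleword view of a 128-bit value: 'low' stands in
   for gen_lowpart (DImode, x) and 'high' for gen_highpart (DImode, x).  */
struct dw_pair
{
  uint64_t low;
  uint64_t high;
};

/* Model of the copy from the PTImode temporary to the TImode result.  */
static struct dw_pair
copy_quad (struct dw_pair pti, int words_big_endian)
{
  struct dw_pair ti;

  if (words_big_endian)
    ti = pti;                 /* single TImode move, halves unchanged */
  else
    {
      ti.low = pti.high;      /* low half of result <- high half of temp */
      ti.high = pti.low;      /* high half of result <- low half of temp */
    }
  return ti;
}
```

With words_big_endian set to 0, a pair whose halves are (low 1, high 2)
comes back as (low 2, high 1); with it set to 1 the pair is returned
as is.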

[PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-03-25 Thread Pat Haugen
Power8 can use lq/stq instructions for TI mode atomic_load/store. 
Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once 
bootstrap/regtest finishes)?


-Pat


2014-03-25  Pat Haugen  pthau...@us.ibm.com

* config/rs6000/sync.md (AINT mode_iterator): Move definition.
(loadsync_<mode>): Change mode.
(atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
(load_quadpti, store_quadpti): New.
* config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.

gcc/testsuite:
* gcc.target/powerpc/atomic_load_store-p8.c: New.


Index: testsuite/gcc.target/powerpc/atomic_load_store-p8.c
===
--- testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
+++ testsuite/gcc.target/powerpc/atomic_load_store-p8.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-final { scan-assembler-times "lq" 1 } } */
+/* { dg-final { scan-assembler-times "stq" 1 } } */
+/* { dg-final { scan-assembler-not "bl __atomic" } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+
+__int128
+atomic_load_128_relaxed (__int128 *ptr)
+{
+	return __atomic_load_n (ptr, __ATOMIC_RELAXED);
+}
+
+void
+atomic_store_128_relaxed (__int128 *ptr, __int128 val)
+{
+	__atomic_store_n (ptr, val, __ATOMIC_RELAXED);
+}
+
Index: config/rs6000/sync.md
===
--- config/rs6000/sync.md	(revision 208798)
+++ config/rs6000/sync.md	(working copy)
@@ -107,10 +107,17 @@ (define_insn "isync"
   "isync"
   [(set_attr "type" "isync")])
 
+;; Types that we should provide atomic instructions for.
+(define_mode_iterator AINT [QI
+			    HI
+			    SI
+			    (DI "TARGET_POWERPC64")
+			    (TI "TARGET_SYNC_TI")])
+
 ;; The control dependency used for load dependency described
 ;; in B.2.3 of the Power ISA 2.06B.
 (define_insn "loadsync_<mode>"
-  [(unspec_volatile:BLK [(match_operand:INT1 0 "register_operand" "r")]
+  [(unspec_volatile:BLK [(match_operand:AINT 0 "register_operand" "r")]
 			UNSPECV_ISYNC)
    (clobber (match_scratch:CC 1 "=y"))]
   ""
@@ -119,17 +126,39 @@ (define_insn "loadsync_<mode>"
    (set_attr "length" "12")])
 
 (define_expand "atomic_load<mode>"
-  [(set (match_operand:INT1 0 "register_operand" "")		;; output
-	(match_operand:INT1 1 "memory_operand" ""))		;; memory
+  [(set (match_operand:AINT 0 "register_operand" "")		;; output
+	(match_operand:AINT 1 "memory_operand" ""))		;; memory
    (use (match_operand:SI 2 "const_int_operand" ""))]		;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_QUAD_MEMORY)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
   if (model == MEMMODEL_SEQ_CST)
     emit_insn (gen_hwsync ());
 
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'lq'
+      if (indexed_address (XEXP (op1, 0), TImode))
+	{
+	  rtx old_addr = XEXP (op1, 0);
+	  rtx new_addr = force_reg (Pmode, old_addr);
+	  operands[1] = op1 = replace_equiv_address (op1, new_addr);
+	}
+
+      emit_insn (gen_load_quadpti (pti_reg, op1));
+
+      emit_move_insn (op0, gen_lowpart (TImode, pti_reg));
+    }
 
   switch (model)
     {
@@ -146,12 +175,25 @@ (define_expand "atomic_load<mode>"
   DONE;
 })
 
+(define_insn "load_quadpti"
+  [(set (match_operand:PTI 0 "quad_int_reg_operand" "=r")
+	(unspec:PTI
+	 [(match_operand:TI 1 "quad_memory_operand" "wQ")] UNSPEC_LSQ))]
+  "TARGET_QUAD_MEMORY
+   && !reg_mentioned_p (operands[0], operands[1])"
+  "lq %0,%1"
+  [(set_attr "type" "load")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_store<mode>"
-  [(set (match_operand:INT1 0 "memory_operand" "")		;; memory
-	(match_operand:INT1 1 "register_operand" ""))		;; input
-   (use (match_operand:SI 2 "const_int_operand" ""))]		;; model
+  [(set (match_operand:AINT 0 "memory_operand" "")  ;; memory
+        (match_operand:AINT 1 "register_operand" ""))   ;; input
+   (use (match_operand:SI 2 "const_int_operand" ""))]   ;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_QUAD_MEMORY)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
   switch (model)
     {
@@ -166,10 +208,38 @@ (define_expand "atomic_store<mode>"
     default:
       gcc_unreachable ();
     }
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'stq'
+      if (indexed_address (XEXP (op0, 0), TImode))
+	{
+  rtx old_addr = XEXP