Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
On 04/09/2014 02:56 PM, David Edelsohn wrote:
> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.

I received approval off list for an updated variant of the patch for 4.8, so this patch has now been (re)committed to 4.8/4.9/trunk.

-Pat
Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
> On 03/25/2014 11:20 AM, Pat Haugen wrote:
> > Power8 can use lq/stq instructions for TI mode atomic_load/store.
> >
> > Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> > bootstrap/regtest finishes)?
> >
> > -Pat
> >
> > 2014-03-25  Pat Haugen  <pthau...@us.ibm.com>
> >
> >     * config/rs6000/sync.md (AINT mode_iterator): Move definition.
> >     (loadsync_<mode>): Change mode.
> >     (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
> >     (load_quadpti, store_quadpti): New.
> >     * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
> >
> > gcc/testsuite:
> >     * gcc.target/powerpc/atomic_load_store-p8.c: New.
>
> Updated patch which was approved off list and I have committed.

Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:

checking for suffix of executables...
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c: In function 'libat_load_16':
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failure memory model for '__atomic_compare_exchange'
     atomic_compare_exchange_n (mptr, t, 0, true,
                               ^
make[4]: *** [load_16_.lo] Error 1
make[4]: *** Waiting for unfinished jobs

Thanks,
Bill
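As background on the diagnostic above: the thread does not say which model at that call site was rejected, but the GCC documentation for the __atomic builtins states that the failure memory model given to __atomic_compare_exchange_n cannot be __ATOMIC_RELEASE or __ATOMIC_ACQ_REL. The sketch below is not libatomic's code. load_via_cas is a hypothetical helper, and a 64-bit object stands in for the 16-byte case so that it builds without libatomic. It only illustrates the general idea of reading a value through a compare-and-swap while passing a failure model that is permitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from libatomic: read *ptr by attempting
   a compare-and-swap of 0 with 0.  The object must be writable.  */
static uint64_t
load_via_cas (uint64_t *ptr)
{
  uint64_t expected = 0;

  /* If *ptr is 0, the exchange may store 0 again, which leaves memory
     unchanged.  On any failure (including a spurious one, since this is
     the weak form) the current value of *ptr is copied into 'expected'.
     Either way 'expected' ends up holding the value that was read.

     The success model is __ATOMIC_ACQ_REL and the failure model is
     __ATOMIC_ACQUIRE.  Using __ATOMIC_RELEASE or __ATOMIC_ACQ_REL as the
     failure model is what the documentation disallows.  */
  __atomic_compare_exchange_n (ptr, &expected, 0, true,
                               __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
  return expected;
}
```

A caller would simply write `uint64_t v = load_via_cas (&shared);`. The cost of this approach, compared with a plain atomic load, is that it needs write access to the location.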
Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
I have reverted this on trunk and asked Bill to revert this on the 4.8 branch. This patch is too risky to apply this close to a freeze for 4.9.

Sorry for the problems.

- David

On Wed, Apr 9, 2014 at 2:56 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
> > On 03/25/2014 11:20 AM, Pat Haugen wrote:
> > > Power8 can use lq/stq instructions for TI mode atomic_load/store.
> > >
> > > Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> > > bootstrap/regtest finishes)?
> > >
> > > -Pat
> > >
> > > 2014-03-25  Pat Haugen  <pthau...@us.ibm.com>
> > >
> > >     * config/rs6000/sync.md (AINT mode_iterator): Move definition.
> > >     (loadsync_<mode>): Change mode.
> > >     (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
> > >     (load_quadpti, store_quadpti): New.
> > >     * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
> > >
> > > gcc/testsuite:
> > >     * gcc.target/powerpc/atomic_load_store-p8.c: New.
> >
> > Updated patch which was approved off list and I have committed.
>
> Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:
>
> checking for suffix of executables...
> /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c: In function 'libat_load_16':
> /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failure memory model for '__atomic_compare_exchange'
>      atomic_compare_exchange_n (mptr, t, 0, true,
>                                ^
> make[4]: *** [load_16_.lo] Error 1
> make[4]: *** Waiting for unfinished jobs
>
> Thanks,
> Bill
Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
On Wed, 2014-04-09 at 15:56 -0400, David Edelsohn wrote:
> I have reverted this on trunk and asked Bill to revert this on the 4.8
> branch. This patch is too risky to apply this close to a freeze for
> 4.9.

I've reverted this on 4.8 as r209254.

Thanks,
Bill

> Sorry for the problems.
>
> - David
>
> On Wed, Apr 9, 2014 at 2:56 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> > On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
> > > On 03/25/2014 11:20 AM, Pat Haugen wrote:
> > > > Power8 can use lq/stq instructions for TI mode atomic_load/store.
> > > >
> > > > Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> > > > bootstrap/regtest finishes)?
> > > >
> > > > -Pat
> > > >
> > > > 2014-03-25  Pat Haugen  <pthau...@us.ibm.com>
> > > >
> > > >     * config/rs6000/sync.md (AINT mode_iterator): Move definition.
> > > >     (loadsync_<mode>): Change mode.
> > > >     (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
> > > >     (load_quadpti, store_quadpti): New.
> > > >     * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
> > > >
> > > > gcc/testsuite:
> > > >     * gcc.target/powerpc/atomic_load_store-p8.c: New.
> > >
> > > Updated patch which was approved off list and I have committed.
> >
> > Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:
> >
> > checking for suffix of executables...
> > /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c: In function 'libat_load_16':
> > /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failure memory model for '__atomic_compare_exchange'
> >      atomic_compare_exchange_n (mptr, t, 0, true,
> >                                ^
> > make[4]: *** [load_16_.lo] Error 1
> > make[4]: *** Waiting for unfinished jobs
> >
> > Thanks,
> > Bill
Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
On 03/25/2014 11:20 AM, Pat Haugen wrote:
> Power8 can use lq/stq instructions for TI mode atomic_load/store.
>
> Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
> bootstrap/regtest finishes)?
>
> -Pat
>
> 2014-03-25  Pat Haugen  <pthau...@us.ibm.com>
>
>     * config/rs6000/sync.md (AINT mode_iterator): Move definition.
>     (loadsync_<mode>): Change mode.
>     (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
>     (load_quadpti, store_quadpti): New.
>     * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
>
> gcc/testsuite:
>     * gcc.target/powerpc/atomic_load_store-p8.c: New.

Updated patch which was approved off list and I have committed.

Index: testsuite/gcc.target/powerpc/atomic_load_store-p8.c
===================================================================
--- testsuite/gcc.target/powerpc/atomic_load_store-p8.c (revision 0)
+++ testsuite/gcc.target/powerpc/atomic_load_store-p8.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-final { scan-assembler-times "lq" 1 } } */
+/* { dg-final { scan-assembler-times "stq" 1 } } */
+/* { dg-final { scan-assembler-not "bl __atomic" } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+
+__int128
+atomic_load_128_relaxed (__int128 *ptr)
+{
+  return __atomic_load_n (ptr, __ATOMIC_RELAXED);
+}
+
+void
+atomic_store_128_relaxed (__int128 *ptr, __int128 val)
+{
+  __atomic_store_n (ptr, val, __ATOMIC_RELAXED);
+}
+
Index: config/rs6000/predicates.md
===================================================================
--- config/rs6000/predicates.md (revision 209198)
+++ config/rs6000/predicates.md (working copy)
@@ -624,14 +624,14 @@ (define_predicate "offsettable_mem_opera
        (match_test "offsettable_nonstrict_memref_p (op)")))
 
 ;; Return 1 if the operand is suitable for load/store quad memory.
-;; This predicate only checks for non-atomic loads/stores.
+;; This predicate only checks for non-atomic loads/stores (not lqarx/stqcx).
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
   rtx addr, op0, op1;
   int ret;
 
-  if (!TARGET_QUAD_MEMORY)
+  if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
     ret = 0;
 
   else if (!memory_operand (op, mode))
Index: config/rs6000/sync.md
===================================================================
--- config/rs6000/sync.md (revision 209198)
+++ config/rs6000/sync.md (working copy)
@@ -107,10 +107,17 @@ (define_insn "isync"
   "isync"
   [(set_attr "type" "isync")])
 
+;; Types that we should provide atomic instructions for.
+(define_mode_iterator AINT [QI
+                            HI
+                            SI
+                            (DI "TARGET_POWERPC64")
+                            (TI "TARGET_SYNC_TI")])
+
 ;; The control dependency used for load dependency described
 ;; in B.2.3 of the Power ISA 2.06B.
 (define_insn "loadsync_<mode>"
-  [(unspec_volatile:BLK [(match_operand:INT1 0 "register_operand" "r")]
+  [(unspec_volatile:BLK [(match_operand:AINT 0 "register_operand" "r")]
                         UNSPECV_ISYNC)
    (clobber (match_scratch:CC 1 "=y"))]
   ""
@@ -118,18 +125,56 @@ (define_insn "loadsync_<mode>"
   [(set_attr "type" "isync")
    (set_attr "length" "12")])
 
+(define_insn "load_quadpti"
+  [(set (match_operand:PTI 0 "quad_int_reg_operand" "=r")
+        (unspec:PTI
+         [(match_operand:TI 1 "quad_memory_operand" "wQ")] UNSPEC_LSQ))]
+  "TARGET_SYNC_TI
+   && !reg_mentioned_p (operands[0], operands[1])"
+  "lq %0,%1"
+  [(set_attr "type" "load")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_load<mode>"
-  [(set (match_operand:INT1 0 "register_operand" "")           ;; output
-        (match_operand:INT1 1 "memory_operand" ""))            ;; memory
+  [(set (match_operand:AINT 0 "register_operand" "")           ;; output
+        (match_operand:AINT 1 "memory_operand" ""))            ;; memory
    (use (match_operand:SI 2 "const_int_operand" ""))]          ;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_SYNC_TI)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
   if (model == MEMMODEL_SEQ_CST)
     emit_insn (gen_hwsync ());
 
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'lq'
+      if (indexed_address (XEXP (op1, 0), TImode))
+        {
+          rtx old_addr = XEXP (op1, 0);
+          rtx new_addr = force_reg (Pmode, old_addr);
+          operands[1] = op1 = replace_equiv_address (op1, new_addr);
+        }
+
+      emit_insn (gen_load_quadpti (pti_reg, op1));
+
+      if (WORDS_BIG_ENDIAN)
+        emit_move_insn (op0, gen_lowpart (TImode, pti_reg));
+      else
+        {
+          emit_move_insn (gen_lowpart (DImode, op0), gen_highpart (DImode, pti_reg));
+          emit_move_insn (gen_highpart (DImode, op0), gen_lowpart (DImode, pti_reg));
+        }
+    }
 
   switch (model)
     {
@@ -146,12 +191,24 @@ (define_expand
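Compared with the original posting, the updated patch as quoted here also covers the case where WORDS_BIG_ENDIAN is false: the two DImode halves of the PTImode result are moved crosswise into the TImode destination. The following is only a value-level model of that swap in plain C, not GCC internals. The names u128, swap_doublewords and model_pti_to_ti are hypothetical, unsigned __int128 stands in for both modes, and the code assumes a compiler that provides __int128.

```c
#include <stdbool.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* stand-in for both PTImode and TImode */

/* Exchange the high and low 64-bit halves of a 128-bit value.  */
static u128
swap_doublewords (u128 pti)
{
  uint64_t pti_low = (uint64_t) pti;           /* models gen_lowpart (DImode, pti_reg)  */
  uint64_t pti_high = (uint64_t) (pti >> 64);  /* models gen_highpart (DImode, pti_reg) */

  /* Low half of the result comes from the high half of the input,
     and the high half of the result from the low half of the input.  */
  return ((u128) pti_low << 64) | pti_high;
}

/* Models the two branches of the expander: a plain copy when
   words_big_endian is true, a doubleword swap otherwise.  */
static u128
model_pti_to_ti (u128 pti, bool words_big_endian)
{
  return words_big_endian ? pti : swap_doublewords (pti);
}
```

Applying swap_doublewords twice returns the original value, which is a quick sanity check when experimenting with this model.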
[PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode
Power8 can use lq/stq instructions for TI mode atomic_load/store.

Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once bootstrap/regtest finishes)?

-Pat

2014-03-25  Pat Haugen  <pthau...@us.ibm.com>

    * config/rs6000/sync.md (AINT mode_iterator): Move definition.
    (loadsync_<mode>): Change mode.
    (atomic_load<mode>, atomic_store<mode>): Add support for TI mode.
    (load_quadpti, store_quadpti): New.
    * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.

gcc/testsuite:
    * gcc.target/powerpc/atomic_load_store-p8.c: New.

Index: testsuite/gcc.target/powerpc/atomic_load_store-p8.c
===================================================================
--- testsuite/gcc.target/powerpc/atomic_load_store-p8.c (revision 0)
+++ testsuite/gcc.target/powerpc/atomic_load_store-p8.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-final { scan-assembler-times "lq" 1 } } */
+/* { dg-final { scan-assembler-times "stq" 1 } } */
+/* { dg-final { scan-assembler-not "bl __atomic" } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+
+__int128
+atomic_load_128_relaxed (__int128 *ptr)
+{
+  return __atomic_load_n (ptr, __ATOMIC_RELAXED);
+}
+
+void
+atomic_store_128_relaxed (__int128 *ptr, __int128 val)
+{
+  __atomic_store_n (ptr, val, __ATOMIC_RELAXED);
+}
+
Index: config/rs6000/sync.md
===================================================================
--- config/rs6000/sync.md (revision 208798)
+++ config/rs6000/sync.md (working copy)
@@ -107,10 +107,17 @@ (define_insn "isync"
   "isync"
   [(set_attr "type" "isync")])
 
+;; Types that we should provide atomic instructions for.
+(define_mode_iterator AINT [QI
+                            HI
+                            SI
+                            (DI "TARGET_POWERPC64")
+                            (TI "TARGET_SYNC_TI")])
+
 ;; The control dependency used for load dependency described
 ;; in B.2.3 of the Power ISA 2.06B.
 (define_insn "loadsync_<mode>"
-  [(unspec_volatile:BLK [(match_operand:INT1 0 "register_operand" "r")]
+  [(unspec_volatile:BLK [(match_operand:AINT 0 "register_operand" "r")]
                         UNSPECV_ISYNC)
    (clobber (match_scratch:CC 1 "=y"))]
   ""
@@ -119,17 +126,39 @@ (define_insn "loadsync_<mode>"
    (set_attr "length" "12")])
 
 (define_expand "atomic_load<mode>"
-  [(set (match_operand:INT1 0 "register_operand" "")           ;; output
-        (match_operand:INT1 1 "memory_operand" ""))            ;; memory
+  [(set (match_operand:AINT 0 "register_operand" "")           ;; output
+        (match_operand:AINT 1 "memory_operand" ""))            ;; memory
    (use (match_operand:SI 2 "const_int_operand" ""))]          ;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_QUAD_MEMORY)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
   if (model == MEMMODEL_SEQ_CST)
     emit_insn (gen_hwsync ());
 
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'lq'
+      if (indexed_address (XEXP (op1, 0), TImode))
+        {
+          rtx old_addr = XEXP (op1, 0);
+          rtx new_addr = force_reg (Pmode, old_addr);
+          operands[1] = op1 = replace_equiv_address (op1, new_addr);
+        }
+
+      emit_insn (gen_load_quadpti (pti_reg, op1));
+
+      emit_move_insn (op0, gen_lowpart (TImode, pti_reg));
+    }
 
   switch (model)
     {
@@ -146,12 +175,25 @@ (define_expand "atomic_load<mode>"
   DONE;
 })
 
+(define_insn "load_quadpti"
+  [(set (match_operand:PTI 0 "quad_int_reg_operand" "=r")
+        (unspec:PTI
+         [(match_operand:TI 1 "quad_memory_operand" "wQ")] UNSPEC_LSQ))]
+  "TARGET_QUAD_MEMORY
+   && !reg_mentioned_p (operands[0], operands[1])"
+  "lq %0,%1"
+  [(set_attr "type" "load")
+   (set_attr "length" "4")])
+
 (define_expand "atomic_store<mode>"
-  [(set (match_operand:INT1 0 "memory_operand" "")             ;; memory
-        (match_operand:INT1 1 "register_operand" ""))          ;; input
-   (use (match_operand:SI 2 "const_int_operand" ""))]          ;; model
+  [(set (match_operand:AINT 0 "memory_operand" "")             ;; memory
+        (match_operand:AINT 1 "register_operand" ""))          ;; input
+   (use (match_operand:SI 2 "const_int_operand" ""))]          ;; model
   ""
 {
+  if (<MODE>mode == TImode && !TARGET_QUAD_MEMORY)
+    FAIL;
+
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
   switch (model)
     {
@@ -166,10 +208,38 @@ (define_expand "atomic_store<mode>"
     default:
       gcc_unreachable ();
     }
-  emit_move_insn (operands[0], operands[1]);
+  if (<MODE>mode != TImode)
+    emit_move_insn (operands[0], operands[1]);
+  else
+    {
+      rtx op0 = operands[0];
+      rtx op1 = operands[1];
+      rtx pti_reg = gen_reg_rtx (PTImode);
+
+      // Can't have indexed address for 'stq'
+      if (indexed_address (XEXP (op0, 0), TImode))
+        {
+          rtx old_addr = XEXP