Re: [PING][PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
On 10/11/14 16:55, Renlin Li wrote:
> On 06/11/14 15:00, Renlin Li wrote:
>> Hi all,
>>
>> Does anybody have time to review this?
>>
>> Kind regards,
>> Renlin Li
>>
>> On 31/10/14 14:51, Renlin Li wrote:
>>> Hi all,
>>>
>>> This is a patch which will fix PR63424.
>>> [...]
>
> Hi,
>
> Does anybody have time to review this?
>
> Thank you so much!
>
> Regards,
> Renlin Li

Ping again.

Regards,
Renlin Li
Re: [PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
On 10/31/2014 03:51 PM, Renlin Li wrote:
> +(define_expand "<su><maxmin>v2di3"
> +  [(parallel [
> +    (set (match_operand:V2DI 0 "register_operand" "")
> +         (MAXMIN:V2DI (match_operand:V2DI 1 "register_operand" "")
> +                      (match_operand:V2DI 2 "register_operand" "")))
> +    (clobber (reg:CC CC_REGNUM))])]
> +  "TARGET_SIMD"

There's no clobber of CC_REGNUM, so you can take that out.

Otherwise it looks good.

r~
Re: [PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
On 19/11/14 11:20, Richard Henderson wrote:
> On 10/31/2014 03:51 PM, Renlin Li wrote:
>> +(define_expand "<su><maxmin>v2di3"
>> +  [(parallel [
>> +    (set (match_operand:V2DI 0 "register_operand" "")
>> +         (MAXMIN:V2DI (match_operand:V2DI 1 "register_operand" "")
>> +                      (match_operand:V2DI 2 "register_operand" "")))
>> +    (clobber (reg:CC CC_REGNUM))])]
>> +  "TARGET_SIMD"
>
> There's no clobber of CC_REGNUM, so you can take that out.
>
> Otherwise it looks good.
>
> r~

Committed with your suggestion.

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=217786

Regards,
Renlin Li
Re: [PING][PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
On 06/11/14 15:00, Renlin Li wrote:
> Hi all,
>
> Does anybody have time to review this?
>
> Kind regards,
> Renlin Li
>
> On 31/10/14 14:51, Renlin Li wrote:
>> Hi all,
>>
>> This is a patch which will fix PR63424.
>> [...]

Hi,

Does anybody have time to review this?

Thank you so much!

Regards,
Renlin Li
[PING][PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
Hi all,

Does anybody have time to review this?

Kind regards,
Renlin Li

On 31/10/14 14:51, Renlin Li wrote:
> Hi all,
>
> This is a patch which will fix PR63424.
> [...]
[PATCH][AARCH64]Fix PR63424 by adding <su><maxmin>v2di3 pattern
Hi all,

This patch fixes PR63424. It implements the signed/unsigned max/min
patterns for V2DI mode in terms of the vcondv2div2di pattern.

In this particular case, a VEC_COND_EXPR (V2DImode) is generated because the
aarch64 target supports it (vcond<mode><mode> for VALL). The VEC_COND_EXPR
is then unconditionally folded into MIN_EXPR/MAX_EXPR in the dom pass.

Later, in the expand pass, the compiler tries to expand the MIN_EXPR using
the standard RTL operation. This fails because the aarch64 target does not
implement a minv2di3 pattern. It then tries to generate a conditional move
and a compare-and-branch sequence; both fail. Finally it falls back to a
libfunc call, which also fails, and the compiler reports an ICE.

The aarch64-none-elf toolchain has been tested on the model, with no
regressions.

Is it okay for trunk?

gcc/ChangeLog:

2014-10-31  Renlin Li  renlin...@arm.com

        PR target/63424
        * config/aarch64/aarch64-simd.md (<su><maxmin>v2di3): New.

gcc/testsuite/ChangeLog:

2014-10-31  Renlin Li  renlin...@arm.com

        PR target/63424
        * gcc.target/aarch64/pr63424.c: New.

From 3bfb5960ffd4e0606fb02cf553cd5dcd45340810 Mon Sep 17 00:00:00 2001
From: Renlin Li renlin...@arm.com
Date: Wed, 29 Oct 2014 09:35:25 +
Subject: [PATCH 1/2] fix PR63424

---
 gcc/config/aarch64/aarch64-simd.md         | 35 +
 gcc/testsuite/gcc.target/aarch64/pr63424.c | 39 
 2 files changed, 74 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr63424.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index cab26a3..41ddbb4 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -951,6 +951,41 @@
   [(set_attr "type" "neon_minmax<q>")]
 )
 
+(define_expand "<su><maxmin>v2di3"
+  [(parallel [
+    (set (match_operand:V2DI 0 "register_operand" "")
+         (MAXMIN:V2DI (match_operand:V2DI 1 "register_operand" "")
+                      (match_operand:V2DI 2 "register_operand" "")))
+    (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_SIMD"
+{
+  enum rtx_code cmp_operator;
+  rtx cmp_fmt;
+
+  switch (<CODE>)
+    {
+    case UMIN:
+      cmp_operator = LTU;
+      break;
+    case SMIN:
+      cmp_operator = LT;
+      break;
+    case UMAX:
+      cmp_operator = GTU;
+      break;
+    case SMAX:
+      cmp_operator = GT;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1], operands[2]);
+  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
+              operands[2], cmp_fmt, operands[1], operands[2]));
+  DONE;
+})
+
 ;; vec_concat gives a new vector with the low elements from operand 1, and
 ;; the high elements from operand 2.  That is to say, given op1 = { a, b }
 ;; op2 = { c, d }, vec_concat (op1, op2) = { a, b, c, d }.
diff --git a/gcc/testsuite/gcc.target/aarch64/pr63424.c b/gcc/testsuite/gcc.target/aarch64/pr63424.c
new file mode 100644
index 000..c6bd762
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr63424.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#include <stdint.h>
+
+uint32_t
+truncate_int (const unsigned long long value)
+{
+  if ( value < 0 )
+    {
+      return 0;
+    }
+  else if ( value > UINT32_MAX )
+    {
+      return UINT32_MAX;
+    }
+  else
+    return (uint32_t)value;
+}
+
+uint32_t
+mul (const unsigned long long x, const unsigned long long y)
+{
+  uint32_t value = truncate_int (x * y);
+  return value;
+}
+
+uint32_t *
+test(unsigned size, uint32_t *a, uint32_t s)
+{
+  unsigned i;
+
+  for (i = 0; i < size; i++)
+    {
+      a[i] = mul (a[i], s);
+    }
+
+  return a;
+}
-- 
1.7.9.5
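
For readers less familiar with the vcond approach, the following is a rough C model of what the expander appears to do semantically: pick a comparison per operation (LTU, LT, GTU or GT) and then select, lane by lane, between the two inputs. It is only a sketch, not GCC code. The type `v2di`, the enum `minmax_op`, and the functions `lane_cmp` and `v2di_minmax` are hypothetical names invented for this illustration, and the signed cases assume the usual two's-complement conversion from `uint64_t` to `int64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a V2DI value: two 64-bit lanes. */
typedef struct { uint64_t lane[2]; } v2di;

/* Hypothetical names mirroring the four rtx codes in the expander's switch. */
enum minmax_op { OP_UMIN, OP_SMIN, OP_UMAX, OP_SMAX };

/* Per-lane comparison chosen for each operation. */
int
lane_cmp (enum minmax_op op, uint64_t a, uint64_t b)
{
  switch (op)
    {
    case OP_UMIN:
      return a < b;                       /* models LTU */
    case OP_SMIN:
      return (int64_t) a < (int64_t) b;   /* models LT  */
    case OP_UMAX:
      return a > b;                       /* models GTU */
    case OP_SMAX:
      return (int64_t) a > (int64_t) b;   /* models GT  */
    }
  assert (0 && "unreachable");
  return 0;
}

/* Compare-and-select: r[i] = cmp (a[i], b[i]) ? a[i] : b[i].  */
v2di
v2di_minmax (enum minmax_op op, v2di a, v2di b)
{
  v2di r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = lane_cmp (op, a.lane[i], b.lane[i]) ? a.lane[i] : b.lane[i];
  return r;
}
```

With `a = { 1, UINT64_MAX }` and `b = { 2, 5 }`, `v2di_minmax (OP_UMIN, a, b)` yields `{ 1, 5 }`, whereas `OP_SMIN` yields `{ 1, UINT64_MAX }` because the second lane of `a` is -1 when read as signed. That difference is why the expander needs a distinct comparison code for each of the four operations.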