Re: [3/3][aarch64] Add support for vec_widen_shift pattern
Richard Biener writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>>
>> > > +#include
>> > > +
>> >
>> > SVE targets will need a:
>> >
>> > #pragma GCC target "+nosve"
>> >
>> > here, since we'll generate different code for SVE.
>>
>> Fixed.
>>
>> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
>> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>> >
>> > Very minor nit, sorry, but I think:
>> >
>> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>> >
>> > would be better.  Using "\t" works, but IIRC it shows up as a tab
>> > character in the testsuite result summary too.
>>
>> Fixed.  Minor nits welcome. :)
>>
>> > OK for the aarch64 bits with the testsuite changes above.
>>
>> ok?
>
> The gcc/tree-vect-stmts.c parts are OK.

Same for the AArch64 stuff.

Thanks,
Richard
Re: [3/3][aarch64] Add support for vec_widen_shift pattern
On Fri, 13 Nov 2020, Joel Hutton wrote:

> Tests are still running, but I believe I've addressed all the comments.
>
> > > +#include
> > > +
> >
> > SVE targets will need a:
> >
> > #pragma GCC target "+nosve"
> >
> > here, since we'll generate different code for SVE.
>
> Fixed.
>
> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
> >
> > Very minor nit, sorry, but I think:
> >
> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
> >
> > would be better.  Using "\t" works, but IIRC it shows up as a tab
> > character in the testsuite result summary too.
>
> Fixed.  Minor nits welcome. :)
>
> > OK for the aarch64 bits with the testsuite changes above.
>
> ok?

The gcc/tree-vect-stmts.c parts are OK.

Richard.

> gcc/ChangeLog:
>
> 2020-11-13  Joel Hutton
>
> 	* config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo
> 	patterns.
> 	* tree-vect-stmts.c
> 	(vectorizable_conversion): Fix for widen_lshift case.
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-13  Joel Hutton
>
> 	* gcc.target/aarch64/vect-widen-lshift.c: New test.

--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend
Re: [3/3][aarch64] Add support for vec_widen_shift pattern
Tests are still running, but I believe I've addressed all the comments.

> > +#include
> > +
>
> SVE targets will need a:
>
> #pragma GCC target "+nosve"
>
> here, since we'll generate different code for SVE.

Fixed.

> > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>
> Very minor nit, sorry, but I think:
>
> /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>
> would be better.  Using "\t" works, but IIRC it shows up as a tab
> character in the testsuite result summary too.

Fixed.  Minor nits welcome. :)

> OK for the aarch64 bits with the testsuite changes above.

ok?

gcc/ChangeLog:

2020-11-13  Joel Hutton

	* config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo
	patterns.
	* tree-vect-stmts.c
	(vectorizable_conversion): Fix for widen_lshift case.

gcc/testsuite/ChangeLog:

2020-11-13  Joel Hutton

	* gcc.target/aarch64/vect-widen-lshift.c: New test.

From e8d3ed6fa739850eb649b97c250f1f2c650c34c1 Mon Sep 17 00:00:00 2001
From: Joel Hutton
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
mid-end.  This pattern takes one vector with N elements of size S, shifts
each element left by the element width, and stores the results as N
elements of size 2*S (in 2 result vectors).  The aarch64 backend
implements this with the shll,shll2 instruction pair.
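The lo/hi split described in the commit message can be modelled in portable C (a host-side sketch of the semantics only, not the backend code; the function names here are made up for illustration): one function widens and shifts the low half of an 8-element 16-bit input, the other the high half, mirroring what the shll/shll2 pair produces.

```c
#include <stdint.h>

#define LANES 8   /* model a V8HI input: 8 x 16-bit elements */

/* Low half: widen elements 0..LANES/2-1 to 32 bits and shift each
   left by n (shll accepts shifts up to the element width, here 16).  */
static void
widen_shiftl_lo (uint32_t out[LANES / 2], const uint16_t in[LANES],
		 unsigned n)
{
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (uint32_t) in[i] << n;
}

/* High half: the same for elements LANES/2..LANES-1 (what shll2 does).  */
static void
widen_shiftl_hi (uint32_t out[LANES / 2], const uint16_t in[LANES],
		 unsigned n)
{
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (uint32_t) in[i + LANES / 2] << n;
}
```

Together the two calls consume one input vector of N narrow elements and fill two result vectors of N/2 wide elements each, which is exactly the "N elements of size S in, N elements of size 2*S out, across 2 result vectors" shape the optab describes.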
---
 gcc/config/aarch64/aarch64-simd.md            | 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c    | 62 +
 gcc/tree-vect-stmts.c                         |  5 +-
 3 files changed, 131 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 30299610635..4ba799a27c9 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4664,8 +4664,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )

+(define_expand "vec_widen_shiftl_lo_"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+		  (match_operand:SI 2
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+    emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
+					  p, operands[2]));
+    DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+  [(set (match_operand: 0 "register_operand")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+		  (match_operand:SI 2
+		    "immediate_operand" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+    emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
+					   p, operands[2]));
+    DONE;
+  }
+)
+
 ;; vshll_n
+
+(define_insn "aarch64_shll_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+		     (match_operand:VQW 1 "register_operand" "w")
+		     (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+		  (match_operand:SI 3
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+      return "shll\\t%0., %1., %3";
+    else
+      return "shll\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_shll2_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+		     (match_operand:VQW 1 "register_operand" "w")
+		     (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+		  (match_operand:SI 3
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+      return "shll2\\t%0., %1., %3";
+    else
+      return "shll2\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")

diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index 000..48a3719d4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include
+#include
+
+#pragma GCC target "+nosve"
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll,shll2 pair*/
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i]   = a[i]   << 16;
+        foo[i+1] = a[i+1] << 16;
+        foo[i+2] = a[i+2] << 16;
+        foo[i+3] = a[i+3] << 16;
+    }
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a,
Re: [3/3][aarch64] Add support for vec_widen_shift pattern
Joel Hutton via Gcc-patches writes:
> Hi all,
>
> This patch adds support in the aarch64 backend for the vec_widen_shift
> vect-pattern and makes a minor mid-end fix to support it.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> Ok for stage 1?
>
> gcc/ChangeLog:
>
> 2020-11-12  Joel Hutton
>
>         * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo
> patterns
>         * tree-vect-stmts.c
>         (vectorizable_conversion): Fix for widen_lshift case
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12  Joel Hutton
>
>         * gcc.target/aarch64/vect-widen-lshift.c: New test.
>
> From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
> From: Joel Hutton
> Date: Thu, 12 Nov 2020 11:48:25 +
> Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern
>
> Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
> mid-end.
> ---
>  gcc/config/aarch64/aarch64-simd.md            | 66 +++
>  .../gcc.target/aarch64/vect-widen-lshift.c    | 60 +
>  gcc/tree-vect-stmts.c                         |  9 ++-
>  3 files changed, 133 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4711,8 +4711,74 @@
>    [(set_attr "type" "neon_sat_shift_reg")]
>  )
>
> +(define_expand "vec_widen_shiftl_lo_"
> +  [(set (match_operand: 0 "register_operand" "=w")
> +	(unspec: [(match_operand:VQW 1 "register_operand" "w")
> +		  (match_operand:SI 2
> +		    "aarch64_simd_shift_imm_bitsize_" "i")]
> +	 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
> +    emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
> +					  p, operands[2]));
> +    DONE;
> +  }
> +)
> +
> +(define_expand "vec_widen_shiftl_hi_"
> +  [(set (match_operand: 0 "register_operand")
> +	(unspec: [(match_operand:VQW 1 "register_operand" "w")
> +		  (match_operand:SI 2
> +		    "immediate_operand" "i")]
> +	 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
> +    emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
> +					   p, operands[2]));
> +    DONE;
> +  }
> +)
> +
>  ;; vshll_n
> +
> +(define_insn "aarch64_shll_internal"
> +  [(set (match_operand: 0 "register_operand" "=w")
> +	(unspec: [(vec_select:
> +		     (match_operand:VQW 1 "register_operand" "w")
> +		     (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
> +		  (match_operand:SI 3
> +		    "aarch64_simd_shift_imm_bitsize_" "i")]
> +	 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
> +      return "shll\\t%0., %1., %3";
> +    else
> +      return "shll\\t%0., %1., %3";
> +  }
> +  [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
> +(define_insn "aarch64_shll2_internal"
> +  [(set (match_operand: 0 "register_operand" "=w")
> +	(unspec: [(vec_select:
> +		     (match_operand:VQW 1 "register_operand" "w")
> +		     (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
> +		  (match_operand:SI 3
> +		    "aarch64_simd_shift_imm_bitsize_" "i")]
> +	 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
> +      return "shll2\\t%0., %1., %3";
> +    else
> +      return "shll2\\t%0., %1., %3";
> +  }
> +  [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
>  (define_insn "aarch64_shll_n"
>    [(set (match_operand: 0 "register_operand" "=w")
> 	 (unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> new file mode 100644
> index ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> @@ -0,0 +1,60 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -save-temps" } */
> +#include
> +#include
> +

SVE targets will need a:

#pragma GCC target "+nosve"

here, since we'll generate different code for SVE.

> +#define ARR_SIZE 1024
> +
> +/* Should produce an shll,shll2 pair*/
> +void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +    {
> +        foo[i] = a[i]
Re: [3/3][aarch64] Add support for vec_widen_shift pattern
On Thu, 12 Nov 2020, Joel Hutton wrote:

> Hi all,
>
> This patch adds support in the aarch64 backend for the vec_widen_shift
> vect-pattern and makes a minor mid-end fix to support it.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> Ok for stage 1?

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f12fd158b13656ee24022ec7e445c53444be6554..1f40b59c0560eec675af1d9a0e3e818d47589de6 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4934,8 +4934,13 @@ vectorizable_conversion (vec_info *vinfo,
					 _oprnds1);
       if (code == WIDEN_LSHIFT_EXPR)
	{
-	  vec_oprnds1.create (ncopies * ninputs);
-	  for (i = 0; i < ncopies * ninputs; ++i)
+	  int oprnds_size = ncopies * ninputs;
+	  /* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
+	   * should be obtained by the size of vec_oprnds0.  */

You should be able to always use vec_oprnds0.length ().  This hunk is OK
with that change.

+	  if (slp_node)
+	    oprnds_size = vec_oprnds0.length ();
+	  vec_oprnds1.create (oprnds_size);
+	  for (i = 0; i < oprnds_size; ++i)
	    vec_oprnds1.quick_push (op1);
	}

       /* Arguments are ready.  Create the new vector stmts.  */

> gcc/ChangeLog:
>
> 2020-11-12  Joel Hutton
>
>         * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo
> patterns
>         * tree-vect-stmts.c
>         (vectorizable_conversion): Fix for widen_lshift case
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12  Joel Hutton
>
>         * gcc.target/aarch64/vect-widen-lshift.c: New test.

--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend
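The operand-count fix discussed in the hunk above can be modelled in a few lines of C (a sketch of the control flow only — the real code fills a GCC vec<tree> of vector defs; `oprnds0_len` stands in for `vec_oprnds0.length ()`): under SLP, `ncopies` is 1, so `ncopies * ninputs` undercounts the number of shift operands needed.

```c
/* Model of the corrected sizing logic from vectorizable_conversion:
   non-SLP code paths need ncopies * ninputs operands, while SLP
   (where ncopies == 1) needs one operand per vector def already
   collected in vec_oprnds0.  */
static int
oprnds1_size (int slp_node, int ncopies, int ninputs, int oprnds0_len)
{
  int oprnds_size = ncopies * ninputs;
  if (slp_node)
    oprnds_size = oprnds0_len;   /* i.e. vec_oprnds0.length () */
  return oprnds_size;
}
```

As the review points out, in the real code both branches can collapse to simply using `vec_oprnds0.length ()`, since the non-SLP path fills `vec_oprnds0` with `ncopies * ninputs` defs anyway.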
[3/3][aarch64] Add support for vec_widen_shift pattern
Hi all,

This patch adds support in the aarch64 backend for the vec_widen_shift
vect-pattern and makes a minor mid-end fix to support it.

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton

        * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo
patterns
        * tree-vect-stmts.c
        (vectorizable_conversion): Fix for widen_lshift case

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton

        * gcc.target/aarch64/vect-widen-lshift.c: New test.

From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
From: Joel Hutton
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
mid-end.
---
 gcc/config/aarch64/aarch64-simd.md            | 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c    | 60 +
 gcc/tree-vect-stmts.c                         |  9 ++-
 3 files changed, 133 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4711,8 +4711,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )

+(define_expand "vec_widen_shiftl_lo_"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+		  (match_operand:SI 2
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+    emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
+					  p, operands[2]));
+    DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+  [(set (match_operand: 0 "register_operand")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+		  (match_operand:SI 2
+		    "immediate_operand" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+    emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
+					   p, operands[2]));
+    DONE;
+  }
+)
+
 ;; vshll_n
+
+(define_insn "aarch64_shll_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+		     (match_operand:VQW 1 "register_operand" "w")
+		     (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+		  (match_operand:SI 3
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+      return "shll\\t%0., %1., %3";
+    else
+      return "shll\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_shll2_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+		     (match_operand:VQW 1 "register_operand" "w")
+		     (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+		  (match_operand:SI 3
+		    "aarch64_simd_shift_imm_bitsize_" "i")]
+	 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+      return "shll2\\t%0., %1., %3";
+    else
+      return "shll2\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")

diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include
+#include
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll,shll2 pair*/
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i]   = a[i]   << 16;
+        foo[i+1] = a[i+1] << 16;
+        foo[i+2] = a[i+2] << 16;
+        foo[i+3] = a[i+3] << 16;
+    }
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i]   = a[i]   << 16;
+        foo[i+1] = a[i+1] << 16;
+        foo[i+2] = a[i+2] << 16;
+        foo[i+3] = a[i+3] << 16;
+    }
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE;i++)
+    {
+      a[i] = i;
+      b[i] = 2*i;
+    }
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+    uint32_t foo_arr[ARR_SIZE];
+    uint32_t bar_arr[ARR_SIZE];
+    uint16_t a[ARR_SIZE];
+    uint16_t b[ARR_SIZE];
+
+    init(a, b);
+    sshll_opt(foo_arr, a, b);
+
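The runtime check the (truncated) test above is building toward can be sketched in portable C: run the widening-shift loop the pattern targets and compare it element by element against the reference computation. This is a host-runnable model of the test's intent, not the dejagnu test itself — on aarch64 at -O3 the loop in `sshll` is the shape expected to vectorize to a shll/shll2 pair.

```c
#include <stdint.h>

#define ARR_SIZE 1024

/* The loop shape vec_widen_lshift targets: each 16-bit element is
   widened to 32 bits and shifted left by the element width (16).  */
static void
sshll (uint32_t *out, const uint16_t *in)
{
  for (int i = 0; i < ARR_SIZE; i++)
    out[i] = (uint32_t) in[i] << 16;
}

/* Compare against an element-by-element reference; 0 on success.  */
static int
check_widen_lshift (void)
{
  uint16_t a[ARR_SIZE];
  uint32_t got[ARR_SIZE];

  for (int i = 0; i < ARR_SIZE; i++)
    a[i] = (uint16_t) i;
  sshll (got, a);
  for (int i = 0; i < ARR_SIZE; i++)
    if (got[i] != ((uint32_t) a[i] << 16))
      return 1;
  return 0;
}
```

The dejagnu test gets the same effect by compiling the reference loop at -O0 (`__attribute__((optimize (0)))`) so only the candidate loop is vectorized.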