Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-17 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>> 
>> > > +#include 
>> > > +
>> > 
>> > SVE targets will need a:
>> > 
>> > #pragma GCC target "+nosve"
>> > 
>> > here, since we'll generate different code for SVE.
>> 
>> Fixed.
>> 
>> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
>> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>> > 
>> > Very minor nit, sorry, but I think:
>> > 
>> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>> > 
>> > would be better.  Using "…\t" works, but IIRC it shows up as a tab
>> > character in the testsuite result summary too.
>> 
>> Fixed. Minor nits welcome. :)
>> 
>> 
>> > OK for the aarch64 bits with the testsuite changes above.
>> ok?
>
> The gcc/tree-vect-stmts.c parts are OK.

Same for the AArch64 stuff.

Thanks,
Richard


Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-16 Thread Richard Biener
On Fri, 13 Nov 2020, Joel Hutton wrote:

> Tests are still running, but I believe I've addressed all the comments.
> 
> > > +#include 
> > > +
> > 
> > SVE targets will need a:
> > 
> > #pragma GCC target "+nosve"
> > 
> > here, since we'll generate different code for SVE.
> 
> Fixed.
> 
> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
> > 
> > Very minor nit, sorry, but I think:
> > 
> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
> > 
> > would be better.  Using "…\t" works, but IIRC it shows up as a tab
> > character in the testsuite result summary too.
> 
> Fixed. Minor nits welcome. :)
> 
> 
> > OK for the aarch64 bits with the testsuite changes above.
> ok?

The gcc/tree-vect-stmts.c parts are OK.

Richard.

> gcc/ChangeLog:
> 
> 2020-11-13  Joel Hutton  
> 
> * config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo
> patterns.
> * tree-vect-stmts.c
> (vectorizable_conversion): Fix for widen_lshift case.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-13  Joel Hutton  
> 
> * gcc.target/aarch64/vect-widen-lshift.c: New test.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer


Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-13 Thread Joel Hutton via Gcc-patches
Tests are still running, but I believe I've addressed all the comments.

> > +#include 
> > +
> 
> SVE targets will need a:
> 
> #pragma GCC target "+nosve"
> 
> here, since we'll generate different code for SVE.

Fixed.

> > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
> 
> Very minor nit, sorry, but I think:
> 
> /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
> 
> would be better.  Using "…\t" works, but IIRC it shows up as a tab
> character in the testsuite result summary too.

Fixed. Minor nits welcome. :)


> OK for the aarch64 bits with the testsuite changes above.
ok?

gcc/ChangeLog:

2020-11-13  Joel Hutton  

* config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo
  patterns.
* tree-vect-stmts.c
(vectorizable_conversion): Fix for widen_lshift case.

gcc/testsuite/ChangeLog:

2020-11-13  Joel Hutton  

* gcc.target/aarch64/vect-widen-lshift.c: New test.
From e8d3ed6fa739850eb649b97c250f1f2c650c34c1 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix a bug they trigger in
the mid-end.  The pattern takes one vector with N elements of size S,
shifts each element left by the element width, and stores the results as
N elements of size 2*S (in two result vectors).  The aarch64 backend
implements this with the shll/shll2 instruction pair.
---
 gcc/config/aarch64/aarch64-simd.md| 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c| 62 +
 gcc/tree-vect-stmts.c |  5 +-
 3 files changed, 131 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 30299610635..4ba799a27c9 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4664,8 +4664,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )
 
+(define_expand "vec_widen_shiftl_lo_<mode>"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
+    emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
+						     p, operands[2]));
+    DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_<mode>"
+   [(set (match_operand:<VWIDE> 0 "register_operand")
+	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "immediate_operand" "i")]
+			  VSHLL))]
+   "TARGET_SIMD"
+   {
+     rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
+     emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
+						       p, operands[2]));
+     DONE;
+   }
+)
+
 ;; vshll_n
 
+(define_insn "aarch64_<sur>shll<mode>_internal"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(vec_select:<VHALF>
+			    (match_operand:VQW 1 "register_operand" "w")
+			    (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+      return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+    else
+      return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_<sur>shll2<mode>_internal"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(vec_select:<VHALF>
+			    (match_operand:VQW 1 "register_operand" "w")
+			    (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+      return "shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+    else
+      return "<sur>shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
 	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index 000..48a3719d4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#pragma GCC target "+nosve"
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll/shll2 pair.  */
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, 

Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-13 Thread Richard Sandiford via Gcc-patches
Joel Hutton via Gcc-patches  writes:
> Hi all,
>
> This patch adds support in the aarch64 backend for the vec_widen_shift 
> vect-pattern and makes a minor mid-end fix to support it.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> Ok for stage 1?
>
> gcc/ChangeLog:
>
> 2020-11-12  Joel Hutton  
>
>         * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo 
> patterns
>         * tree-vect-stmts.c 
>         (vectorizable_conversion): Fix for widen_lshift case
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12  Joel Hutton  
>
>         * gcc.target/aarch64/vect-widen-lshift.c: New test.
>
> From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
> From: Joel Hutton 
> Date: Thu, 12 Nov 2020 11:48:25 +
> Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern
>
> Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
> mid-end.
> ---
>  gcc/config/aarch64/aarch64-simd.md| 66 +++
>  .../gcc.target/aarch64/vect-widen-lshift.c| 60 +
>  gcc/tree-vect-stmts.c |  9 ++-
>  3 files changed, 133 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4711,8 +4711,74 @@
>[(set_attr "type" "neon_sat_shift_reg")]
>  )
>  
> +(define_expand "vec_widen_shiftl_lo_<mode>"
> +  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> +	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
> +			 (match_operand:SI 2
> +			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> +			 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
> +    emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
> +						     p, operands[2]));
> +    DONE;
> +  }
> +)
> +
> +(define_expand "vec_widen_shiftl_hi_<mode>"
> +   [(set (match_operand:<VWIDE> 0 "register_operand")
> +	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
> +			 (match_operand:SI 2
> +			   "immediate_operand" "i")]
> +			  VSHLL))]
> +   "TARGET_SIMD"
> +   {
> +    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
> +    emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
> +						      p, operands[2]));
> +    DONE;
> +   }
> +)
> +
>  ;; vshll_n
>  
> +(define_insn "aarch64_<sur>shll<mode>_internal"
> +  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> +	(unspec:<VWIDE> [(vec_select:<VHALF>
> +			    (match_operand:VQW 1 "register_operand" "w")
> +			    (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
> +			 (match_operand:SI 3
> +			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> +			 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
> +      return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> +    else
> +      return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> +  }
> +  [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
> +(define_insn "aarch64_<sur>shll2<mode>_internal"
> +  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> +	(unspec:<VWIDE> [(vec_select:<VHALF>
> +			    (match_operand:VQW 1 "register_operand" "w")
> +			    (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
> +			 (match_operand:SI 3
> +			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> +			 VSHLL))]
> +  "TARGET_SIMD"
> +  {
> +    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
> +      return "shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> +    else
> +      return "<sur>shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> +  }
> +  [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
>  (define_insn "aarch64_shll_n"
>[(set (match_operand: 0 "register_operand" "=w")
>   (unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c 
> b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> new file mode 100644
> index 
> ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> @@ -0,0 +1,60 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -save-temps" } */
> +#include <stdint.h>
> +#include <string.h>
> +

SVE targets will need a:

#pragma GCC target "+nosve"

here, since we'll generate different code for SVE.

> +#define ARR_SIZE 1024
> +
> +/* Should produce an shll,shll2 pair*/
> +void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> +for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +{
> +foo[i]   = a[i] 

Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-13 Thread Richard Biener
On Thu, 12 Nov 2020, Joel Hutton wrote:

> Hi all,
> 
> This patch adds support in the aarch64 backend for the vec_widen_shift 
> vect-pattern and makes a minor mid-end fix to support it.
> 
> All 3 patches together bootstrapped and regression tested on aarch64.
> 
> Ok for stage 1?

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index
f12fd158b13656ee24022ec7e445c53444be6554..1f40b59c0560eec675af1d9a0e3e818d47
589de6 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4934,8 +4934,13 @@ vectorizable_conversion (vec_info *vinfo,
 _oprnds1);
   if (code == WIDEN_LSHIFT_EXPR)
{
-	  vec_oprnds1.create (ncopies * ninputs);
-	  for (i = 0; i < ncopies * ninputs; ++i)
+	  int oprnds_size = ncopies * ninputs;
+	  /* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
+	     should be obtained by the size of vec_oprnds0.  */

You should be able to always use vec_oprnds0.length ()

This hunk is OK with that change.

+	  if (slp_node)
+	    oprnds_size = vec_oprnds0.length ();
+	  vec_oprnds1.create (oprnds_size);
+	  for (i = 0; i < oprnds_size; ++i)
+	    vec_oprnds1.quick_push (op1);
 	}
   /* Arguments are ready.  Create the new vector stmts.  */

> 
> gcc/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo 
> patterns
>         * tree-vect-stmts.c 
>         (vectorizable_conversion): Fix for widen_lshift case
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * gcc.target/aarch64/vect-widen-lshift.c: New test.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer


[3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all,

This patch adds support in the aarch64 backend for the vec_widen_shift 
vect-pattern and makes a minor mid-end fix to support it.

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo patterns
        * tree-vect-stmts.c 
        (vectorizable_conversion): Fix for widen_lshift case

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton  

        * gcc.target/aarch64/vect-widen-lshift.c: New test.
From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix a bug they trigger in
the mid-end.
---
 gcc/config/aarch64/aarch64-simd.md| 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c| 60 +
 gcc/tree-vect-stmts.c |  9 ++-
 3 files changed, 133 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4711,8 +4711,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )
 
+(define_expand "vec_widen_shiftl_lo_<mode>"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
+    emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
+						     p, operands[2]));
+    DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_<mode>"
+   [(set (match_operand:<VWIDE> 0 "register_operand")
+	(unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "immediate_operand" "i")]
+			  VSHLL))]
+   "TARGET_SIMD"
+   {
+     rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
+     emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
+						       p, operands[2]));
+     DONE;
+   }
+)
+
 ;; vshll_n
 
+(define_insn "aarch64_<sur>shll<mode>_internal"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(vec_select:<VHALF>
+			    (match_operand:VQW 1 "register_operand" "w")
+			    (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+      return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+    else
+      return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_<sur>shll2<mode>_internal"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+	(unspec:<VWIDE> [(vec_select:<VHALF>
+			    (match_operand:VQW 1 "register_operand" "w")
+			    (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+      return "shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+    else
+      return "<sur>shll2\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
 	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll/shll2 pair.  */
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+for( int i = 0; i < ARR_SIZE;i++)
+{
+  a[i] = i;
+  b[i] = 2*i;
+}
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+uint32_t foo_arr[ARR_SIZE];
+uint32_t bar_arr[ARR_SIZE];
+uint16_t a[ARR_SIZE];
+uint16_t b[ARR_SIZE];
+
+init(a, b);
+sshll_opt(foo_arr, a, b);
+