Re: [PATCH AArch64] Prefer dup to zip for vec_perm_const; enable dup for bigendian; add testcase.

2014-08-05 Thread Richard Earnshaw

On 04/08/14 14:32, Alan Lawrence wrote:
> At the moment, for two-element vectors, __builtin_shuffle (vector, (mask) {C, 
> C}) for identical constants C outputs a zip (with both argument vectors the 
> same) rather than a dup. Dup is more obvious and easier to read, so prefer it.
> 
> For big-endian, aarch64_evpc_dup always aborts; however tests demonstrate it 
> works ok, so enable it.
> 
> Finally, add a testcase (of execution results, this gives confidence that
> evpc_dup is ok for bigendian - yes, a different element index is output than
> for little-endian). Note existing tests for zip are not affected, they always
> have the two arguments different.
> 
> gcc/ChangeLog:
>   * config/aarch64/aarch64.c (aarch64_evpc_dup): Enable for bigendian.
>   (aarch64_expand_vec_perm_const): Check for dup before zip.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/aarch64/vdup_n_2.c: New test.
> 
> 

OK.

R.

[PATCH AArch64] Prefer dup to zip for vec_perm_const; enable dup for bigendian; add testcase.

2014-08-04 Thread Alan Lawrence

At the moment, for two-element vectors, __builtin_shuffle (vector, (mask) {C, 
C}) for identical constants C outputs a zip (with both argument vectors the 
same) rather than a dup. Dup is more obvious and easier to read, so prefer it.
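
To illustrate why both instructions are candidates here, the following is a minimal scalar sketch in plain C, not code from the patch. The helper names `model_zip2` and `model_dup_lane` are hypothetical; they only model the element movement of a ZIP2 and of a DUP-from-lane on arrays. With two elements and the same vector passed as both ZIP2 inputs, the two models produce the same result.

```c
#include <stddef.h>

/* Hypothetical scalar model of a two-input ZIP2 on n-element vectors
   (n even): interleaves the upper halves of a and b.  */
void
model_zip2 (const float *a, const float *b, float *out, size_t n)
{
  size_t half = n / 2;
  for (size_t i = 0; i < half; i++)
    {
      out[2 * i] = a[half + i];
      out[2 * i + 1] = b[half + i];
    }
}

/* Hypothetical scalar model of DUP from a lane: copies a[lane] into
   every element of out.  */
void
model_dup_lane (const float *a, size_t lane, float *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[lane];
}
```

Calling `model_zip2 (v, v, z, 2)` and `model_dup_lane (v, 1, d, 2)` on the same two-element array `v` fills `z` and `d` with identical contents, which is the overlap described above.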


For big-endian, aarch64_evpc_dup always aborts; however tests demonstrate it 
works ok, so enable it.


Finally, add a testcase (of execution results, this gives confidence that 
evpc_dup is ok for bigendian - yes, a different element index is output than for 
little-endian). Note existing tests for zip are not affected, they always have 
the two arguments different.
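
As a rough illustration of the "different element index" remark, the sketch below shows one plausible form of a big-endian lane correction: mirroring the lane index within the vector. This is an assumption made for illustration, not code taken from the patch, and `model_endian_lane` is a hypothetical name.

```c
#include <stdbool.h>

/* Hypothetical helper: map a GCC vector-extension element index n to
   the lane number printed in the assembly, assuming big-endian simply
   mirrors the index within a vector of nunits elements.  */
unsigned int
model_endian_lane (unsigned int nunits, unsigned int n, bool big_endian)
{
  return big_endian ? nunits - 1 - n : n;
}
```

Under that assumption, the selector {1, 1} on a two-element vector maps to lane 1 on little-endian and lane 0 on big-endian, which is consistent with the testcase's scan-assembler pattern accepting either `[0]` or `[1]`.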


gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_evpc_dup): Enable for bigendian.
(aarch64_expand_vec_perm_const): Check for dup before zip.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vdup_n_2.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4c65bb1dbc190165eee9dd2d9b54779ac4a362fa..153b1c3d282cbfb4872d2b267e763c9ec0ddeb90 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9157,10 +9157,6 @@ aarch64_evpc_dup (struct expand_vec_perm_d *d)
   unsigned int i, elt, nelt = d->nelt;
   rtx lane;
 
-  /* TODO: This may not be big-endian safe.  */
-  if (BYTES_BIG_ENDIAN)
-    return false;
-
   elt = d->perm[0];
   for (i = 1; i < nelt; i++)
     {
@@ -9174,7 +9170,7 @@ aarch64_evpc_dup (struct expand_vec_perm_d *d)
      use d->op0 and need not do any extra arithmetic to get the
      correct lane number.  */
   in0 = d->op0;
-  lane = GEN_INT (elt);
+  lane = GEN_INT (elt); /* The pattern corrects for big-endian.  */
 
   switch (vmode)
     {
@@ -9255,14 +9251,14 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
 	return true;
       else if (aarch64_evpc_ext (d))
 	return true;
+      else if (aarch64_evpc_dup (d))
+	return true;
       else if (aarch64_evpc_zip (d))
 	return true;
       else if (aarch64_evpc_uzp (d))
 	return true;
       else if (aarch64_evpc_trn (d))
 	return true;
-      else if (aarch64_evpc_dup (d))
-	return true;
       return aarch64_evpc_tbl (d);
     }
   return false;
diff --git a/gcc/testsuite/gcc.target/aarch64/vdup_n_2.c b/gcc/testsuite/gcc.target/aarch64/vdup_n_2.c
new file mode 100644
index ..660fb0faeabcc632ae3edb1fb8fa9b96d57a4923
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vdup_n_2.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline --save-temps" } */
+
+extern void abort (void);
+
+typedef float float32x2_t __attribute__ ((__vector_size__ ((8))));
+typedef unsigned int uint32x2_t __attribute__ ((__vector_size__ ((8))));
+
+float32x2_t
+test_dup_1 (float32x2_t in)
+{
+  return __builtin_shuffle (in, (uint32x2_t) {1, 1});
+}
+
+int
+main (int argc, char **argv)
+{
+  float32x2_t test = {2.718, 3.141};
+  float32x2_t res = test_dup_1 (test);
+  if (res[0] != test[1] || res[1] != test[1])
+    abort ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "\[ \t\]*dup\[ \t\]+v\[0-9\]+\.2s, ?v\[0-9\]+\.s\\\[\[01\]\\\]" 1 } } */
+/* { dg-final { scan-assembler-not "zip" } } */
+/* { dg-final { cleanup-saved-temps } } */
+
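
The effect of reordering the checks can be sketched with a simplified first-match-wins model in plain C. This is not the GCC implementation: the function names are hypothetical, the zip check is a reduced model of the single-input case only, and `nelt` is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if every selector index equals the first one, i.e. the
   permutation broadcasts a single element.  */
bool
model_is_dup (const unsigned char *perm, size_t nelt)
{
  for (size_t i = 1; i < nelt; i++)
    if (perm[i] != perm[0])
      return false;
  return true;
}

/* Simplified model: true if the selector looks like ZIP1 or ZIP2 when
   both inputs are the same vector, so indices wrap modulo nelt.  */
bool
model_is_zip_one_vector (const unsigned char *perm, size_t nelt)
{
  size_t high = nelt / 2;
  size_t mask = nelt - 1;
  size_t base;

  if (perm[0] == high)
    base = high;		/* ZIP2-like.  */
  else if (perm[0] == 0)
    base = 0;			/* ZIP1-like.  */
  else
    return false;

  for (size_t i = 0; i < nelt / 2; i++)
    {
      size_t elt = (i + base) & mask;
      if (perm[2 * i] != elt)
	return false;
      elt = (elt + nelt) & mask;
      if (perm[2 * i + 1] != elt)
	return false;
    }
  return true;
}

/* First match wins; dup_first selects the order of the two checks.  */
const char *
model_pick (const unsigned char *perm, size_t nelt, bool dup_first)
{
  if (dup_first && model_is_dup (perm, nelt))
    return "dup";
  if (model_is_zip_one_vector (perm, nelt))
    return "zip";
  if (model_is_dup (perm, nelt))
    return "dup";
  return "tbl";
}
```

In this model the selector {1, 1} with `nelt` of 2 yields "zip" when the zip check runs first and "dup" when the dup check runs first, while a four-element selector such as {1, 1, 1, 1} yields "dup" in either order. That matches the description that only two-element vectors were affected.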
