Hi!

The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons.  One is the trivial missing -Wno-psabi which the following patch
adds, but that isn't enough.  The thing is that without native vector
support, we have VEC_PERM_EXPRs in the IL and do consider the optimization
that merges nested VEC_PERM_EXPRs into one VEC_PERM_EXPR, but punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.

Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
that can be handled by the backend natively into one VEC_PERM_EXPR
that can't be handled.  But if both of the original VEC_PERM_EXPRs
can't be handled natively either, having just one VEC_PERM_EXPR that will be
lowered by generic vec lowering is IMHO still better than 2.
The same holds if we trade one VEC_PERM_EXPR that can't be handled plus
one that can for a single VEC_PERM_EXPR that can't be handled.
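
To spell out the reasoning above, here is a small standalone C sketch.  It
is not GCC code: merge_allowed, unhandled_before, unhandled_after and
check_all_cases are hypothetical names, and the boolean flags merely model
the results of can_vec_perm_const_p and single_use (@0).  It also assumes
that sel0 is the selector of the inner VEC_PERM_EXPR, which stays in the IL
unless it has a single use.  Under those assumptions it checks that the new
condition allows the merge exactly when the merged permutation can be
handled or the number of permutations that can't be handled doesn't grow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the condition added to match.pd.  Each *_ok
   flag models the result of can_vec_perm_const_p for one selector:
   sel0_ok and sel1_ok for the two original VEC_PERM_EXPRs, sel2_ok for
   the merged one.  inner_single_use models single_use (@0).  */
static bool
merge_allowed (bool sel2_ok, bool sel0_ok, bool sel1_ok,
               bool inner_single_use)
{
  return sel2_ok
         || (inner_single_use
             ? (!sel0_ok || !sel1_ok)
             : !sel1_ok);
}

/* Number of VEC_PERM_EXPRs the target can't handle before merging.  */
static int
unhandled_before (bool sel0_ok, bool sel1_ok)
{
  return !sel0_ok + !sel1_ok;
}

/* The same count after merging.  Assumption: sel0 belongs to the inner
   VEC_PERM_EXPR, which goes away only if it has a single use.  */
static int
unhandled_after (bool sel2_ok, bool sel0_ok, bool inner_single_use)
{
  return !sel2_ok + (inner_single_use ? 0 : !sel0_ok);
}

/* Walk all 16 flag combinations and check that the merge is allowed
   exactly when the merged permutation is handled or the count of
   unhandled permutations does not increase.  */
static void
check_all_cases (void)
{
  for (int i = 0; i < 16; i++)
    {
      bool sel2_ok = i & 1, sel0_ok = i & 2, sel1_ok = i & 4;
      bool inner_single_use = i & 8;
      bool no_increase
        = (unhandled_after (sel2_ok, sel0_ok, inner_single_use)
           <= unhandled_before (sel0_ok, sel1_ok));
      assert (merge_allowed (sel2_ok, sel0_ok, sel1_ok, inner_single_use)
              == (sel2_ok || no_increase));
    }
}
```

Calling check_all_cases () from a test driver exercises every combination.
For example, with nothing handled natively and a single-use inner
permutation the merge is allowed, while with both originals handled and
the merged one not, it is rejected.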

Lightly tested so far, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

BTW, the testcase also needs to have executable permissions removed...

2022-10-20  <ja...@redhat.com>

        PR tree-optimization/54346
        * match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
        Optimize nested VEC_PERM_EXPRs even if target can't handle the
        new one provided we don't increase number of VEC_PERM_EXPRs the
        target can't handle.

        * gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.
        
--- gcc/match.pd.jj     2022-10-19 11:28:35.111654555 +0200
+++ gcc/match.pd        2022-10-20 13:45:57.489512189 +0200
@@ -8118,7 +8118,16 @@ and,
        vec_perm_indices sel2 (builder2, 2, nelts);
 
        tree op0 = NULL_TREE;
-       if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
+       /* If the new VEC_PERM_EXPR can't be handled but both
+         original VEC_PERM_EXPRs can, punt.
+         If one or both of the original VEC_PERM_EXPRs can't be
+         handled and the new one can't be either, don't increase
+         number of VEC_PERM_EXPRs that can't be handled.  */
+       if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)
+          || (single_use (@0)
+              ? (!can_vec_perm_const_p (result_mode, op_mode, sel0, false)
+                 || !can_vec_perm_const_p (result_mode, op_mode, sel1, false))
+              : !can_vec_perm_const_p (result_mode, op_mode, sel1, false)))
         op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
      }
      (if (op0)
--- gcc/testsuite/gcc.dg/pr54346.c.jj   2022-10-11 10:00:07.456124822 +0200
+++ gcc/testsuite/gcc.dg/pr54346.c      2022-10-20 13:46:10.933330119 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-dse1" } */
+/* { dg-options "-O -fdump-tree-dse1 -Wno-psabi" } */
 
 typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
 

        Jakub
