Hi!

The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons. One is the trivially missing -Wno-psabi, which the following
patch adds, but that isn't enough. Without native vector support we
still have VEC_PERM_EXPRs in the IL and do consider the optimization
that merges nested VEC_PERM_EXPRs into one VEC_PERM_EXPR, but we punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is
false.
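
To make the merge of nested VEC_PERM_EXPRs concrete, here is a minimal
plain-C sketch. It is a model only, not GCC code: vec_perm, compose_sel
and check_merge are hypothetical helpers operating on arrays, and the
assumed semantics are that c = VEC_PERM_EXPR <a, b, sel0> followed by
d = VEC_PERM_EXPR <c, c, sel1> equals d = VEC_PERM_EXPR <a, b, sel2>
with sel2[i] = sel0[sel1[i] % nelts].

```c
#define NELTS 4

/* Model of VEC_PERM_EXPR <a, b, sel>: indices 0..NELTS-1 select from a,
   indices NELTS..2*NELTS-1 select from b; indices wrap modulo 2*NELTS.  */
static void
vec_perm (const int *a, const int *b, const unsigned *sel, int *out)
{
  for (int i = 0; i < NELTS; i++)
    {
      unsigned idx = sel[i] % (2 * NELTS);
      out[i] = idx < NELTS ? a[idx] : b[idx - NELTS];
    }
}

/* Build the merged selector.  The outer permutation reads the same
   vector for both operands, so its indices reduce modulo NELTS.  */
static void
compose_sel (const unsigned *sel0, const unsigned *sel1, unsigned *sel2)
{
  for (int i = 0; i < NELTS; i++)
    sel2[i] = sel0[sel1[i] % NELTS] % (2 * NELTS);
}

/* Return 1 if the nested and the merged permutations agree on the
   sample inputs, 0 otherwise.  The merged result is left in d_merged.  */
static int
check_merge (int *d_merged)
{
  const int a[NELTS] = { 10, 11, 12, 13 };
  const int b[NELTS] = { 20, 21, 22, 23 };
  const unsigned sel0[NELTS] = { 0, 5, 2, 7 };	/* inner selector */
  const unsigned sel1[NELTS] = { 3, 2, 1, 0 };	/* outer selector */
  unsigned sel2[NELTS];
  int c[NELTS], d_nested[NELTS];

  vec_perm (a, b, sel0, c);		/* c = perm (a, b, sel0) */
  vec_perm (c, c, sel1, d_nested);	/* d = perm (c, c, sel1) */

  compose_sel (sel0, sel1, sel2);
  vec_perm (a, b, sel2, d_merged);	/* d = perm (a, b, sel2) */

  for (int i = 0; i < NELTS; i++)
    if (d_nested[i] != d_merged[i])
      return 0;
  return 1;
}
```

Calling check_merge from a small main is enough to try it; the point is
only that one permutation with the composed selector replaces two.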

Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs that
can be handled by the backend natively into one VEC_PERM_EXPR that
can't be handled. But if neither of the original VEC_PERM_EXPRs can be
handled natively, having just one VEC_PERM_EXPR that will be lowered by
generic vec lowering is IMHO still better than 2. The same holds if we
trade one VEC_PERM_EXPR that can't be handled plus one that can for a
single one that can't be handled.

Lightly tested so far, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

BTW, the testcase also needs to have executable permissions removed...

2022-10-20 <ja...@redhat.com>

	PR tree-optimization/54346
	* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
	Optimize nested VEC_PERM_EXPRs even if target can't handle the
	new one provided we don't increase number of VEC_PERM_EXPRs the
	target can't handle.

	* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.

--- gcc/match.pd.jj 2022-10-19 11:28:35.111654555 +0200
+++ gcc/match.pd 2022-10-20 13:45:57.489512189 +0200
@@ -8118,7 +8118,16 @@ and,

              vec_perm_indices sel2 (builder2, 2, nelts);
              tree op0 = NULL_TREE;
-             if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
+             /* If the new VEC_PERM_EXPR can't be handled but both
+                original VEC_PERM_EXPRs can, punt.
+                If one or both of the original VEC_PERM_EXPRs can't be
+                handled and the new one can't be either, don't increase
+                number of VEC_PERM_EXPRs that can't be handled.  */
+             if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)
+                 || (single_use (@0)
+                     ? (!can_vec_perm_const_p (result_mode, op_mode, sel0, false)
+                        || !can_vec_perm_const_p (result_mode, op_mode, sel1, false))
+                     : !can_vec_perm_const_p (result_mode, op_mode, sel1, false)))
                op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
            }
            (if (op0)
--- gcc/testsuite/gcc.dg/pr54346.c.jj 2022-10-11 10:00:07.456124822 +0200
+++ gcc/testsuite/gcc.dg/pr54346.c 2022-10-20 13:46:10.933330119 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-dse1" } */
+/* { dg-options "-O -fdump-tree-dse1 -Wno-psabi" } */

 typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));

	Jakub
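
For reference, the condition added to match.pd can be read as a small
truth function. The sketch below is a plain-C model, not GCC code:
should_merge and its parameter names are hypothetical, and it assumes
sel0 is the selector of the inner VEC_PERM_EXPR, sel1 that of the outer
one, and sel2 that of the merged one. Each bool stands for the result
of the corresponding can_vec_perm_const_p call.

```c
#include <stdbool.h>

/* can_new:   target handles the merged permutation (sel2).
   can_inner: target handles the inner permutation (sel0, assumed).
   can_outer: target handles the outer permutation (sel1, assumed).
   inner_single_use: models single_use (@0), i.e. the inner
   permutation goes away once the outer one stops using it.  */
static bool
should_merge (bool can_new, bool can_inner, bool can_outer,
	      bool inner_single_use)
{
  /* A natively supported result is always fine.  */
  if (can_new)
    return true;

  /* Both originals disappear: merge unless both were supported,
     since that would introduce an unsupported permutation.  */
  if (inner_single_use)
    return !can_inner || !can_outer;

  /* The inner permutation survives for its other users, so only the
     outer one is replaced: merge only if it was unsupported too.  */
  return !can_outer;
}
```

For example, should_merge (false, true, true, true) is false (the punt
case), while should_merge (false, false, true, true) is true because
the count of unsupported permutations does not grow.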