Hi Stefan, due to that missed optimization we currently generate silly code for these two tests and should really fix this (after gcc entering stage1). So just skipping it on s390x would definitely be the wrong choice I think.
I think our vectorize_vec_perm_const correctly rejects this permute pattern, since it would require a load from literal pool. Question is why we do have to rely on this being turned into a permute first to get rid of the obviously redundant assignments. Shouldn't fwprop be able to handle this without it? I'm ok with your patch, but please also open a BZ for it and perhaps mention it in the comment close to the xfail. Thanks! Andreas On 4/22/24 08:23, Stefan Schulze Frielinghaus wrote: > The tests fail on s390 since can_vec_perm_const_p fails and therefore > the bit insert/ref survive which r14-3381-g27de9aa152141e aims for. > Strictly speaking, the tests only fail in case the target supports > vectors, i.e., for targets prior z13 or in case of -mesa the emulated > vector operations are optimized out. > > Easiest would be to skip the entire test for s390. Another solution > would be to xfail in case of vector support hoping that eventually we > end up with an xpass for a future machine generation or if gcc advances. > That is implemented by this patch. In order to do so I implemented a > new target test s390_mvx which tests whether vector support is available > or not. Maybe this is already over-engineered for a simple test? Any > thoughts? > --- > gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c | 4 ++-- > gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c | 4 ++-- > gcc/testsuite/lib/target-supports.exp | 14 ++++++++++++++ > 3 files changed, 18 insertions(+), 4 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > index 7513497f552..b67e3e93a7f 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c > @@ -10,5 +10,5 @@ vector int g(vector int a) > return a; > } > > -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" { xfail > s390_mvx } } } */ > +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail > s390_mvx } } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > index b1e75797a90..0f119675207 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c > @@ -11,6 +11,6 @@ vector int g(vector int a, int c) > return a; > } > > -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" { xfail > s390_mvx } } } */ > +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail > s390_mvx } } } */ > /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "optimized" } } */ > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index edce672c0e2..5a692baa8ef 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -12380,6 +12380,20 @@ proc check_effective_target_profile_update_atomic {} > { > } "-fprofile-update=atomic -fprofile-generate"] > } > > +# Return 1 if the target has a vector facility. > +proc check_effective_target_s390_mvx { } { > + if ![istarget s390*-*-*] then { > + return 0; > + } > + > + return [check_no_compiler_messages_nocache s390_mvx assembly { > + #if !defined __VX__ > + #error no vector facility. > + #endif > + int dummy; > + } [current_compiler_flags]] > +} > + > # Return 1 if vector (va - vector add) instructions are understood by > # the assembler and can be executed. This also covers checking for > # the VX kernel feature. A kernel without that feature does not