On 24/06/2024 12:35, Alexandre Oliva wrote: > On Jun 21, 2024, Christophe Lyon <christophe.l...@linaro.org> wrote: > >>> How about mentioning Christophe's simplification in the commit log? > >> For the avoidance of doubt: it's OK for me (but you don't need to >> mention my name in fact ;-) > > Needing or not, I added it ;-) > >>>> be accepted. (int16_t)32768 >> (int16_t)16 must yield 0, not 1 (as >>>> before the fix). > >>> This is OK, but you might wish to revisit this statement before >>> committing. > > Oh, right, sorry, I messed it up. uint16_t was what I should have put > in there. int16_t would have overflown and invoked undefined behavior > to begin with, and I see it misled you far down the wrong path. > >>> I think the original bug was that we were losing the cast to short > > The problem was that the shift count saturated at 15. AFAIK sign > extension was not relevant. Hopefully the rewritten opening paragraph > below makes that clearer. I will put it in later this week barring > objections or further suggestions of improvement. Thanks,
A signed shift right on a 16-bit vector element by 15 would still yield -1; but ... > > > [testsuite] [arm] [vect] adjust mve-vshr test [PR113281] > > The test was too optimistic, alas. We used to vectorize shifts by > clamping the shift counts below the bit width of the types (e.g. at 15 > for 16-bit vector elements), but (uint16_t)32768 >> (uint16_t)16 is > well defined (because of promotion to 32-bit int) and must yield 0, > not 1 (as before the fix). That make more sense now. Thanks. > > Unfortunately, in the gimple model of vector units, such large shift > counts wouldn't be well-defined, so we won't vectorize such shifts any > more, unless we can tell they're in range or undefined. > > So the test that expected the vectorization we no longer performed > needs to be adjusted. Instead of nobbling the test, Richard Earnshaw > suggested annotating the test with the expected ranges so as to enable > the optimization, and Christophe Lyon suggested a further > simplification. > > > Co-Authored-By: Richard Earnshaw <richard.earns...@arm.com> > > for gcc/testsuite/ChangeLog > > PR tree-optimization/113281 > * gcc.target/arm/simd/mve-vshr.c: Add expected ranges. I think this is OK now. R. > --- > gcc/testsuite/gcc.target/arm/simd/mve-vshr.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vshr.c > b/gcc/testsuite/gcc.target/arm/simd/mve-vshr.c > index 8c7adef9ed8f1..03078de49c65e 100644 > --- a/gcc/testsuite/gcc.target/arm/simd/mve-vshr.c > +++ b/gcc/testsuite/gcc.target/arm/simd/mve-vshr.c > @@ -9,6 +9,8 @@ > void test_ ## NAME ##_ ## SIGN ## BITS ## x ## NB (TYPE##BITS##_t * > __restrict__ dest, TYPE##BITS##_t *a, TYPE##BITS##_t *b) { \ > int i; \ > for (i=0; i<NB; i++) { \ > + if ((unsigned)b[i] >= (unsigned)(BITS)) > \ > + __builtin_unreachable(); \ > dest[i] = a[i] OP b[i]; > \ > } > \ > } > >