Hi Victor,

> -----Original Message-----
> From: Victor Do Nascimento <victor.donascime...@arm.com>
> Sent: Wednesday, July 10, 2024 3:06 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; Victor Do Nascimento
> <vicdo...@e125768.arm.com>
> Subject: [PATCH 10/10] autovectorizer: Test autovectorization of different
> dot-prod modes.
>
> From: Victor Do Nascimento <vicdo...@e125768.arm.com>
>
> Given the novel treatment of the dot product optab as a conversion we
> are now able to target, for a given architecture, different
> relationships between output modes and input modes.
>
> This is made clearer by way of example. Previously, on AArch64, the
> following loop was vectorizable:
>
> uint32_t udot4(int n, uint8_t* data) {
>   uint32_t sum = 0;
>   for (int i=0; i<n; i+=1)
>     sum += data[i] * data[i];
>   return sum;
> }
>
> while the following wasn't:
>
> uint32_t udot2(int n, uint16_t* data) {
>   uint32_t sum = 0;
>   for (int i=0; i<n; i+=1)
>     sum += data[i] * data[i];
>   return sum;
> }
>
> Under the new treatment of the dot product optab, they are both now
> vectorizable.
>
> This adds the relevant target-agnostic check to ensure this behaviour
> in the autovectorizer.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/vect/vect-dotprod-twoway.c: New.
> ---
>  .../gcc.dg/vect/vect-dotprod-twoway.c         | 38 +++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> new file mode 100644
> index 00000000000..5caa7b81fce
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* Ensure both the two-way and four-way dot products are autovectorized. */
> +#include <stdint.h>
> +
> +uint32_t udot4(int n, uint8_t* data) {
> +  uint32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +int32_t sdot4(int n, int8_t* data) {
> +  int32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +uint32_t udot2(int n, uint16_t* data) {
> +  uint32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +int32_t sdot2(int n, int16_t* data) {
> +  int32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */

These tests only check that the loops were vectorized, not that they were
vectorized using a dot product.  I think you want a scan for DOT_PROD_EXPR
as well, gated on the targets that support the two-way dot product.

Cheers,
Tamar

> --
> 2.34.1
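
As a rough illustration of the suggestion above, here is a minimal sketch of
what the extra scan could look like for the two-way cases. The loops are
copied from the patch. The effective-target name vect_two_way_dotprod is a
placeholder invented for this sketch and is not known to exist in
target-supports.exp; whichever keyword actually describes two-way dot-product
support on the target would go in its place.

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
#include <stdint.h>

/* Two-way unsigned case from the patch: 16-bit inputs, 32-bit sum.  */
uint32_t udot2(int n, uint16_t* data) {
  uint32_t sum = 0;
  for (int i=0; i<n; i+=1) {
    sum += data[i] * data[i];
  }
  return sum;
}

/* Two-way signed case from the patch: 16-bit inputs, 32-bit sum.  */
int32_t sdot2(int n, int16_t* data) {
  int32_t sum = 0;
  for (int i=0; i<n; i+=1) {
    sum += data[i] * data[i];
  }
  return sum;
}

/* Check that the vect dump mentions a dot product, not merely that
   the loops were vectorized.  ASSUMPTION: vect_two_way_dotprod is a
   hypothetical effective-target keyword used only for illustration.  */
/* { dg-final { scan-tree-dump "DOT_PROD_EXPR" "vect" { target vect_two_way_dotprod } } } */
```

The scan is written without an expected count, since how many times
DOT_PROD_EXPR shows up in the dump may vary by target. The existing
"vectorized 1 loops" scan would stay alongside it.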