Hi Victor,

> -----Original Message-----
> From: Victor Do Nascimento <victor.donascime...@arm.com>
> Sent: Wednesday, July 10, 2024 3:06 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; Victor Do Nascimento
> <vicdo...@e125768.arm.com>
> Subject: [PATCH 10/10] autovectorizer: Test autovectorization of different
> dot-prod modes.
> 
> From: Victor Do Nascimento <vicdo...@e125768.arm.com>
> 
> Given the novel treatment of the dot product optab as a conversion we
> are now able to target, for a given architecture, different
> relationships between output modes and input modes.
> 
> This is made clearer by way of example. Previously, on AArch64, the
> following loop was vectorizable:
> 
> uint32_t udot4(int n, uint8_t* data) {
>   uint32_t sum = 0;
>   for (int i=0; i<n; i+=1)
>     sum += data[i] * data[i];
>   return sum;
> }
> 
> while the following wasn't:
> 
> uint32_t udot2(int n, uint16_t* data) {
>   uint32_t sum = 0;
>   for (int i=0; i<n; i+=1)
>     sum += data[i] * data[i];
>   return sum;
> }
> 
> Under the new treatment of the dot product optab, they are both now
> vectorizable.
> 
> This adds the relevant target-agnostic check to ensure this behaviour
> in the autovectorizer.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.dg/vect/vect-dotprod-twoway.c: New.
> ---
>  .../gcc.dg/vect/vect-dotprod-twoway.c         | 38 +++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> new file mode 100644
> index 00000000000..5caa7b81fce
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* Ensure both the two-way and four-way dot products are autovectorized.  */
> +#include <stdint.h>
> +
> +uint32_t udot4(int n, uint8_t* data) {
> +  uint32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +int32_t sdot4(int n, int8_t* data) {
> +  int32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +uint32_t udot2(int n, uint16_t* data) {
> +  uint32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +int32_t sdot2(int n, int16_t* data) {
> +  int32_t sum = 0;
> +  for (int i=0; i<n; i+=1) {
> +    sum += data[i] * data[i];
> +  }
> +  return sum;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */

These tests only check that the loops were vectorized, not that they were
vectorized using dotprod.  I think you want a scan for DOT_PROD_EXPR as
well, gated on the targets that support two-way dot prod.
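
Roughly what I have in mind, as a sketch only and reduced to the udot2
case.  The selector name vect_two_way_dotprod is a placeholder I made up
for illustration, not an existing effective-target keyword; the real one
(or a new one) would need to describe the targets that implement the
two-way optab.  I have also used a plain scan-tree-dump rather than a
count, since I have not checked how many times DOT_PROD_EXPR shows up in
the dump.

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
#include <stdint.h>

/* Two-way case from the patch: 16-bit inputs, 32-bit accumulator.  */
uint32_t udot2(int n, uint16_t* data) {
  uint32_t sum = 0;
  for (int i=0; i<n; i+=1) {
    sum += data[i] * data[i];
  }
  return sum;
}

/* Existing style of check: the loop was vectorized at all.  */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

/* Extra check: the vectorizer actually picked a dot product.
   vect_two_way_dotprod is a hypothetical selector name.  */
/* { dg-final { scan-tree-dump "DOT_PROD_EXPR" "vect" { target vect_two_way_dotprod } } } */
```

That way the generic scan still runs everywhere, while the DOT_PROD_EXPR
scan only runs where the two-way form is expected.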

Cheers,
Tamar

> --
> 2.34.1
