The GCC internals manual defines the {u|s}dot_prod<m> standard name
as taking "two signed elements of the same mode, adding them to a
third operand of wider mode".  This wording leaves the relationship
between the mode of the first two operands and that of the third
ambiguous.

This vagueness means that, in theory, the third operand may take
different modes for the same input mode.  A given backend could then
add a different number of vectorized products to the accumulator,
e.g. a backend may provide instructions for both:

  accum += a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

and

  accum += a[0] * b[0] + a[1] * b[1],

as is now seen in the SVE2.1 extension to AArch64.  In spite of this
flexibility, modeling the dot-product operation as a direct optab
means that we have no way to encode both the input and the
accumulator data modes in the backend pattern name, which prevents
us from harnessing it.
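
As a rough scalar illustration of the two forms, consider the sketch
below.  The function names are hypothetical and not part of this
series; the choice of 16-bit inputs with 64-bit and 32-bit
accumulators is an assumption made to show one input mode paired with
two accumulator modes.  Whether either loop is actually vectorized as
a dot product depends on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: 16-bit inputs, 64-bit accumulator.  With
   vectors of equal size, each accumulator lane would cover four
   products (the four-way form).  */
int64_t
dot_s16_s64 (const int16_t *a, const int16_t *b, size_t n)
{
  int64_t accum = 0;
  for (size_t i = 0; i < n; i++)
    accum += (int64_t) a[i] * b[i];
  return accum;
}

/* Hypothetical example: the same 16-bit inputs, 32-bit accumulator.
   Each accumulator lane would now cover two products (the two-way
   form).  The caller must ensure the sum fits in 32 bits.  */
int32_t
dot_s16_s32 (const int16_t *a, const int16_t *b, size_t n)
{
  int32_t accum = 0;
  for (size_t i = 0; i < n; i++)
    accum += a[i] * b[i];
  return accum;
}
```

Both loops read the same input element type, so a pattern name keyed
on a single mode cannot tell them apart.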

The purpose of this patch series is therefore to remedy this
shortcoming by moving `dot_prod' from its current implementation as
a direct optab to a conversion optab, which lets us differentiate
between dot products that take the same input mode but produce a
different output mode.

Regression-tested on x86_64, aarch64 and armhf.  I'd appreciate help
running the relevant tests on the remaining architectures, i.e. arc,
mips, altivec and c6x, to ensure I've not inadvertently broken
anything for those backends.

Victor Do Nascimento (10):
  optabs: Make all `*dot_prod_optab's modeled as conversions
  autovectorizer: Add basic support for convert optabs
  aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns.
  arm: Fix arm backend-use of (u|s|us)dot_prod patterns.
  i386: Fix dot_prod backend patterns for mmx and sse targets
  arc: Adjust dot-product backend patterns
  mips:  Adjust dot-product backend patterns
  altivec: Adjust dot-product backend patterns
  c6x:  Adjust dot-product backend patterns
  autovectorizer: Test autovectorization of different dot-prod modes.

 gcc/config/aarch64/aarch64-builtins.cc        | 71 ++++++++++++++
 gcc/config/aarch64/aarch64-simd-builtins.def  |  4 -
 gcc/config/aarch64/aarch64-simd.md            |  9 +-
 .../aarch64/aarch64-sve-builtins-base.cc      | 13 +--
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 17 ++++
 gcc/config/aarch64/aarch64-sve-builtins.h     |  3 +
 gcc/config/aarch64/aarch64-sve.md             |  6 +-
 gcc/config/aarch64/aarch64-sve2.md            |  2 +-
 gcc/config/aarch64/iterators.md               |  1 +
 gcc/config/arc/simdext.md                     |  8 +-
 gcc/config/arm/arm-builtins.cc                | 95 +++++++++++++++++++
 gcc/config/arm/arm-protos.h                   |  3 +
 gcc/config/arm/arm.cc                         |  1 +
 gcc/config/arm/arm_neon_builtins.def          |  3 -
 gcc/config/arm/neon.md                        |  4 +-
 gcc/config/c6x/c6x.md                         |  2 +-
 gcc/config/i386/mmx.md                        | 30 +++---
 gcc/config/i386/sse.md                        | 47 +++++----
 gcc/config/mips/loongson-mmi.md               |  2 +-
 gcc/config/rs6000/altivec.md                  |  4 +-
 gcc/doc/md.texi                               | 18 ++--
 gcc/gimple-match-exports.cc                   | 18 ++++
 gcc/gimple-match.h                            |  2 +
 gcc/optabs.cc                                 |  3 +-
 gcc/optabs.def                                |  6 +-
 .../gcc.dg/vect/vect-dotprod-twoway.c         | 38 ++++++++
 .../aarch64/sme/vect-dotprod-twoway.c         | 25 +++++
 gcc/tree-vect-loop.cc                         |  1 +
 gcc/tree-vect-patterns.cc                     | 43 ++++++++-
 29 files changed, 399 insertions(+), 80 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/vect-dotprod-twoway.c

-- 
2.34.1
