On Wed, Mar 4, 2026 at 12:12 AM
<[email protected]> wrote:
>
> From: Abhishek Kaushik <[email protected]>
>
> The FMA fold in match.pd currently matches only a bare (negate @0).
> When the negated operand is wrapped in a type conversion
> (e.g. (convert (negate @0))), the simplification to IFN_FNMA does not
> trigger.
>
> This prevents folding of patterns such as:
>
> *c = *c - (v8u)(*a * *b);
>
> when the multiply operands undergo vector type conversions before being
> passed to FMA. In such cases the expression lowers to neg + mla instead
> of a single msb on AArch64 SVE, because the canonicalization step
> cannot see through the casts.
>
> Extend the match pattern to allow optional conversions on the negated
> operand and the second multiplicand:
>
> (fmas:c (convert? (negate @0)) (convert? @1) @2)
>
> and explicitly rebuild the converted operands in the IFN_FNMA
> replacement. This enables recognition of the subtraction-of-product form
> even when vector element type casts are present.
>
> With this change, AArch64 SVE code generation is able to select msb
> instead of emitting a separate neg followed by mla.
>
> This patch was bootstrapped and regression-tested on aarch64-linux-gnu.
>
> gcc/
> PR target/123897
> * match.pd: Allow optional conversions in FMA-to-FNMA
> canonicalization and reconstruct converted operands in
> the replacement.
>
> gcc/testsuite/
> PR target/123897
> * gcc.target/aarch64/sve/fnma_match.c: New test.
> * gcc.target/aarch64/sve/pr123897.c: Adjust to scan for
> FNMA in the tree dump.
> ---
> gcc/match.pd | 4 +--
> .../gcc.target/aarch64/sve/fnma_match.c | 28 +++++++++++++++++++
> .../gcc.target/aarch64/sve/pr123897.c | 3 +-
> 3 files changed, 32 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/fnma_match.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7f16fd4e081..4cce9463f8f 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -10255,8 +10255,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (canonicalize_math_after_vectorization_p ())
> (for fmas (FMA)
> (simplify
> - (fmas:c (negate @0) @1 @2)
> - (IFN_FNMA @0 @1 @2))
> + (fmas:c (convert? (negate @0)) (convert? @1) @2)
> + (IFN_FNMA (convert @0) (convert @1) @2))
I think you need to check that the types are nop conversions rather
than matching just any convert.
So using nop_convert here would be better than adding an explicit
tree_nop_conversion_p check.
Can you check whether using nop_convert would work?
Thanks,
Andrew
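
Untested, but I'd expect something roughly along these lines, using
view_convert in the result since the conversions are then known to be
nops (variant numbering on nop_convert may need adjusting):

    (simplify
     (fmas:c (nop_convert1? (negate @0)) (nop_convert2? @1) @2)
     (IFN_FNMA (view_convert @0) (view_convert @1) @2))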
> (simplify
> (fmas @0 @1 (negate @2))
> (IFN_FMS @0 @1 @2))
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fnma_match.c
> b/gcc/testsuite/gcc.target/aarch64/sve/fnma_match.c
> new file mode 100644
> index 00000000000..08607b172e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fnma_match.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=armv9-a -msve-vector-bits=256" } */
> +
> +typedef __attribute__((__vector_size__(sizeof(int)*8))) signed int v8i;
> +typedef __attribute__((__vector_size__(sizeof(int)*8))) unsigned int v8u;
> +
> +void g(v8i *a,v8i *b,v8u *c)
> +{
> + *c = *c - (v8u)(*a * *b);
> +}
> +
> +void h(v8u *a,v8u *b,v8i *c)
> +{
> + *c = *c - (v8i)(*a * *b);
> +}
> +
> +void x(v8i *a,v8i *b,v8i *c)
> +{
> + *c = *c - (*a * *b);
> +}
> +
> +void y(v8u *a,v8u *b,v8u *c)
> +{
> + *c = *c - (*a * *b);
> +}
> +
> +/* { dg-final { scan-assembler-times "\\tmsb\\t" 4 } } */
> +/* { dg-final { scan-assembler-not "\\tneg\\t" } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr123897.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr123897.c
> index d74efabb7f8..45bc52522a9 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/pr123897.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr123897.c
> @@ -13,4 +13,5 @@ void g(v8i *a,v8i *b,v8u *c)
> *c = *c - (v8u)(*a * *b);
> }
>
> -/* { dg-final { scan-tree-dump-times "\.FMA" 2 "widening_mul" } } */
> +/* { dg-final { scan-tree-dump-times "\.FMA" 1 "widening_mul" } } */
> +/* { dg-final { scan-tree-dump-times "\.FNMA" 1 "widening_mul" } } */
> --
> 2.43.0
>