https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122749

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
A few other examples (though not regressions):
```
#include <arm_sve.h>

svint32_t f0(svint32_t a, svint32_t b, svint32_t c)
{
  return (a * b) + c;
}
svint32_t f(svuint32_t a, svuint32_t b, svint32_t c)
{
  return (svint32_t)(a * b) + c;
}

svint32_t f1(svint32_t a, svint32_t b, svint32_t c)
{
  svuint32_t aa = (svuint32_t)a;
  svuint32_t bb = (svuint32_t)b;
  return (svint32_t)(aa * bb) + c;
}


svuint32_t f2(svint32_t a, svint32_t b, svuint32_t c)
{
  return (svuint32_t)(a * b) + c;
}
```

f0 works as expected.
Note the above is about FMA but COND_FMA is similar.
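
For reference, a scalar sketch of why f, f1 and f2 compute the same bits as f0: modulo 2^32, the multiply and the add do not depend on signedness, so the casts only change the type and not the value. The helper names below are made up for illustration and appear nowhere in GCC. The sketch also assumes GCC's modulo behaviour for out-of-range unsigned to signed conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed multiply-add computed exactly in 64 bits,
   then truncated to 32 bits.  Scalar stand-in for one lane of f0.  */
static int32_t mla_reference(int32_t a, int32_t b, int32_t c)
{
  int64_t wide = (int64_t)a * b + c;   /* cannot overflow int64_t */
  return (int32_t)(uint32_t)wide;      /* keep the low 32 bits */
}

/* Hypothetical helper: same operation done entirely in unsigned
   arithmetic and cast back, like one lane of f/f1.  */
static int32_t mla_via_unsigned(int32_t a, int32_t b, int32_t c)
{
  uint32_t prod = (uint32_t)a * (uint32_t)b;  /* wraps modulo 2^32 */
  return (int32_t)(prod + (uint32_t)c);       /* wraps modulo 2^32 */
}

/* Compare both forms on a few values, including ones that wrap.  */
static void check_equivalence(void)
{
  static const int32_t samples[] = {
    0, 1, -1, 2, -3, 12345, -54321, INT32_MAX, INT32_MIN
  };
  const int n = (int)(sizeof samples / sizeof samples[0]);

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      for (int k = 0; k < n; k++)
        assert(mla_reference(samples[i], samples[j], samples[k])
               == mla_via_unsigned(samples[i], samples[j], samples[k]));
}
```

Calling `check_equivalence()` from a test driver exercises all sample combinations. This is only a per-lane illustration and says nothing about how the SVE patterns themselves match.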

Though I wonder whether, for the unconditional FMA, we could not just provide
the pattern which combine/fwprop tries:

(set (reg:VNx4SI 127 [ vect_x_12.16 ])
    (plus:VNx4SI (mult:VNx4SI (reg:VNx4SI 174 [ vect__4.12_45 ])
            (reg:VNx4SI 119 [ vect_vec_iv_.13 ]))
        (reg:VNx4SI 127 [ vect_x_12.16 ])))

fwprop does try it too:
```
propagating insn 8 into insn 9, replacing:
(set (reg:VNx4SI 107 [ _6 ])
    (plus:VNx4SI (reg:VNx4SI 108 [ _1 ])
        (reg/v:VNx4SI 106 [ cD.14130 ])))
failed to match this instruction:
(set (reg:VNx4SI 107 [ _6 ])
    (plus:VNx4SI (mult:VNx4SI (reg/v:VNx4SI 104 [ aD.14128 ])
            (reg/v:VNx4SI 105 [ bD.14129 ]))
        (reg/v:VNx4SI 106 [ cD.14130 ])))
```

That is, provide it with the extra clobber for the scratch and then do a split.

This will at least give us the FMA, but it will use an always-true predicate
instead of the predicate of the loop, which might be ok ...
