On 7/16/25 8:38 AM, Paul-Antoine Arras wrote:
This pattern enables the combine pass (or late-combine, depending on the case)
to merge a float_extend'ed vec_duplicate into a (possibly negated) minus-mult
RTL instruction.

Before this patch, we have six instructions, e.g.:
   vsetivli       zero,4,e32,m1,ta,ma
   fcvt.s.h       fa5,fa5
   vfmv.v.f       v4,fa5
   vfwcvt.f.f.v   v1,v3
   vsetvli        zero,zero,e32,m1,ta,ma
   vfnmadd.vv     v1,v4,v2

After, we get only one:
   vfwnmacc.vf     v1,fa5,v2
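
For illustration, here is a minimal C sketch of the kind of scalar loop these
patterns are aimed at. It is not taken from the patch or its tests: the
function names, the float-to-double widening and the loop shape are all
assumptions, and whether the vectorizer and combine actually produce a single
vfwnmacc.vf or vfwnmsac.vf depends on the target options and on floating-point
contraction being allowed. The arithmetic follows the RVV definitions
vd[i] = -(f[rs1] * vs2[i]) - vd[i] (vfwnmacc) and
vd[i] = -(f[rs1] * vs2[i]) + vd[i] (vfwnmsac).

```c
#include <stddef.h>

/* Hypothetical example, not from the patch: widening negated
   multiply-accumulate with a scalar multiplicand.  The narrow scalar is
   broadcast (vec_duplicate) and widened (float_extend) along with the
   narrow vector input before the multiply.  */
void
widen_nmacc (double *acc, const float *in, float scalar, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] = -((double) scalar * (double) in[i]) - acc[i];
}

/* Hypothetical example, not from the patch: same shape, but the
   accumulator is added back instead of subtracted.  */
void
widen_nmsac (double *acc, const float *in, float scalar, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] = -((double) scalar * (double) in[i]) + acc[i];
}
```

The f16 tests listed below presumably use the same shape with a half-precision
input widened to float; that is an inference from the file names, not
something stated in this message.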

        PR target/119100

gcc/ChangeLog:

        * config/riscv/autovec-opt.md (*vfwnmacc_vf_<mode>): New pattern.
        (*vfwnmsac_vf_<mode>): New pattern.
        * config/riscv/riscv.cc (get_vector_binary_rtx_cost): Add support for a
        vec_duplicate in a neg.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfwnmacc and
        vfwnmsac.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f16.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f32.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f16.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f32.c: New test.
OK.  I'll push it momentarily.
jeff
