On 7/16/25 8:38 AM, Paul-Antoine Arras wrote:
This pattern enables the combine pass (or late-combine, depending on the case)
to merge a float_extend'ed vec_duplicate into a (possibly negated) minus-mult
RTL instruction.

Before this patch, we have six instructions, e.g.:

  vsetivli      zero,4,e32,m1,ta,ma
  fcvt.s.h      fa5,fa5
  vfmv.v.f      v4,fa5
  vfwcvt.f.f.v  v1,v3
  vsetvli       zero,zero,e32,m1,ta,ma
  vfnmadd.vv    v1,v4,v2

After, we get only one:

  vfwnmacc.vf   v1,fa5,v2

        PR target/119100

gcc/ChangeLog:

        * config/riscv/autovec-opt.md (*vfwnmacc_vf_<mode>): New pattern.
        (*vfwnmsac_vf_<mode>): New pattern.
        * config/riscv/riscv.cc (get_vector_binary_rtx_cost): Add support for
        a vec_duplicate in a neg.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfwnmacc and
        vfwnmsac.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f16.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f32.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f16.c: New test.
        * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f32.c: New test.
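
For readers unfamiliar with the two instructions, here is a rough scalar sketch of what they compute per element, following the RVV definitions (vfwnmacc.vf: vd[i] = -(f[rs1] * vs2[i]) - vd[i]; vfwnmsac.vf: vd[i] = -(f[rs1] * vs2[i]) + vd[i]). The function names are hypothetical and this is not the code from the testsuite files listed above; it uses the f32 to f64 widening case, and whether GCC actually emits the .vf forms for loops like these depends on the target options and cost model.

```c
#include <stddef.h>

/* Hypothetical scalar reference, f32 inputs widened to f64.
   Mirrors vfwnmacc.vf: acc[i] = -(f * b[i]) - acc[i].  */
void
ref_vfwnmacc_vf (double *acc, float f, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] = -((double) f * (double) b[i]) - acc[i];
}

/* Hypothetical scalar reference, f32 inputs widened to f64.
   Mirrors vfwnmsac.vf: acc[i] = -(f * b[i]) + acc[i].  */
void
ref_vfwnmsac_vf (double *acc, float f, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] = -((double) f * (double) b[i]) + acc[i];
}
```

In both loops the scalar f is the value that would otherwise be converted and broadcast with separate instructions before the multiply.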

OK.  I'll push it momentarily.

jeff