On 6/21/23 05:12, Juzhe-Zhong wrote:
This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.

gcc/ChangeLog:

         * config/riscv/autovec.md (fma<mode>4): New pattern.
         (*fma<mode>): Ditto.
         (fnma<mode>4): Ditto.
         (*fnma<mode>): Ditto.
         (fms<mode>4): Ditto.
         (*fms<mode>): Ditto.
         (fnms<mode>4): Ditto.
         (*fnms<mode>): Ditto.
         * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New function.
         * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto.
         * config/riscv/vector.md: Fix attribute bug.

gcc/testsuite/ChangeLog:

         * gcc.target/riscv/rvv/autovec/ternop/ternop-1.c: Add floating-point ternary tests.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-3.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-4.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-6.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-1.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-2.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-3.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-4.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-5.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-6.c: Ditto.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-10.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-11.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-12.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-7.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-8.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop-9.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-10.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-11.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-12.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-7.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-8.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run-9.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-1.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-10.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-11.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-12.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-2.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-3.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-4.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-5.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-6.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-7.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-8.c: New test.
         * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-9.c: New test.

---


+
+(define_insn_and_split "*fma<mode>"
+  [(set (match_operand:VF_AUTO 0 "register_operand"   "=vr, vr, ?&vr")
+       (fma:VF_AUTO
+         (match_operand:VF_AUTO 1 "register_operand" " %0, vr,   vr")
+         (match_operand:VF_AUTO 2 "register_operand" " vr, vr,   vr")
+         (match_operand:VF_AUTO 3 "register_operand" " vr,  0,   vr")))
+   (clobber (match_scratch:SI 4 "=r,r,r"))]
+  "TARGET_VECTOR"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+    PUT_MODE (operands[4], Pmode);

Maybe this has already been answered, but why not get the mode right in the expander & pattern as opposed to blindly changing it in the C fragment?

It's probably technically safe to do what you've done, mostly because by the time the C code runs, we've turned the scratch into a hard register, and I suspect the code that allocates scratches constructs a new hard reg for each scratch rather than using a shared object.

But if we can, let's get the mode right from the beginning.

I'll note the existing define_insn_and_splits for integer FMAs have this same wart.
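
As a rough sketch of what "getting the mode right" could look like (untested, and assuming the P mode iterator from riscv.md, which selects SI or DI to match Pmode), the scratch would carry the pointer mode in the pattern itself. The pattern name below is only illustrative, the fma<mode>4 expander would need a matching clobber, and the rest of the split body is left elided as in the quoted hunk:

```
;; Sketch only: the scratch is declared with mode P instead of SI,
;; so the split code no longer has to patch the mode.
(define_insn_and_split "*fma<VF_AUTO:mode><P:mode>"
  [(set (match_operand:VF_AUTO 0 "register_operand"   "=vr, vr, ?&vr")
        (fma:VF_AUTO
          (match_operand:VF_AUTO 1 "register_operand" " %0, vr,   vr")
          (match_operand:VF_AUTO 2 "register_operand" " vr, vr,   vr")
          (match_operand:VF_AUTO 3 "register_operand" " vr,  0,   vr")))
   (clobber (match_scratch:P 4 "=r,r,r"))]
  "TARGET_VECTOR"
  "#"
  "&& reload_completed"
  [(const_int 0)]
  {
    /* operands[4] is already in Pmode here; no PUT_MODE call.  */
    ...
  })
```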


jeff
