I failed to make the operand Pmode in the pattern itself. I tried the following:

clobber (match_dup_4)

but it causes too many issues. After many attempts, it turns out that only the current solution works.

juzhe.zh...@rivai.ai

From: Jeff Law
Date: 2023-06-21 23:15
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Support RVV floating-point ternary auto-vectorization

On 6/21/23 05:12, Juzhe-Zhong wrote:
> This patch adds RVV floating-point auto-vectorization.
> Also, fix attribute bug of floating-point ternary operations in vector.md.
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (fma<mode>4): New pattern.
> (*fma<mode>): Ditto.
> (fnma<mode>4): Ditto.
> (*fnma<mode>): Ditto.
> (fms<mode>4): Ditto.
> (*fms<mode>): Ditto.
> (fnms<mode>4): Ditto.
> (*fnms<mode>): Ditto.
> * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New
> function.
> * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto.
> * config/riscv/vector.md: Fix attribute bug.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/ternop/ternop-1.c: Add floating-point
> teranary tests.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-10.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-11.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-12.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-7.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-8.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop-9.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-10.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-11.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-12.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-7.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-8.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run-9.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-1.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-10.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-11.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-12.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-2.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-3.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-4.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-5.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-6.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-7.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-8.c: New test.
> * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-9.c: New test.
>
> ---
> +
> +(define_insn_and_split "*fma<mode>"
> +  [(set (match_operand:VF_AUTO 0 "register_operand" "=vr, vr, ?&vr")
> +     (fma:VF_AUTO
> +       (match_operand:VF_AUTO 1 "register_operand" " %0, vr, vr")
> +       (match_operand:VF_AUTO 2 "register_operand" " vr, vr, vr")
> +       (match_operand:VF_AUTO 3 "register_operand" " vr, 0, vr")))
> +   (clobber (match_scratch:SI 4 "=r,r,r"))]
> +  "TARGET_VECTOR"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +  {
> +    PUT_MODE (operands[4], Pmode);

Maybe this has already been answered, but why not get the mode right in the expander & pattern as opposed to blindly changing it in the C fragment?

It's probably technically safe to do what you've done, mostly because by the time the C code runs, we've turned the scratch into a hard register and I suspect the code to allocate scratches probably constructs a new hard reg for each scratch rather than using a shared object.

But if we can, let's get the mode right from the beginning. I'll note the existing define_insn_and_splits for integer FMAs have this same wart.

jeff