Hi,
The code from the PR hits the somewhat rare case where we need to reload
a vector subreg of a scalar register: (subreg:V2HI (reg:DI)).
What complicates things is that the test is compiled with
-mrvv-vector-bits=zvl, so VLS-only mode.
Unfortunately, we can still get VLA-named modes that are actually VLS
modes (i.e. a constant number of units).
For moving real VLS modes we simply use
(define_insn_and_split "*mov<mode>"
[(set (match_operand:VLS_AVL_IMM 0 "reg_or_mem_operand" "=vr, m, vr")
(match_operand:VLS_AVL_IMM 1 "reg_or_mem_operand" " m,vr, vr"))]
Here, lra recognizes the cycle danger, quickly switches to the memory
alternative, and the resulting code is as expected: we perform a vector
load from the memory that the DImode reg was spilled to.
For VLA (named) modes the mov insn is
(define_insn_and_split "*mov<V_FRACT:mode><P:mode>_lra"
[(set (match_operand:V_FRACT 0 "reg_or_mem_operand" "=vr, m,vr")
(match_operand:V_FRACT 1 "reg_or_mem_operand" " m,vr,vr"))
(clobber (match_scratch:P 2 "=&r,&r,X"))]
The extra clobber here is an optimization: For modes smaller than a full
register we want to store the actual size, rather than always the full
vector size. If that size does not fit vsetivli's 5-bit immediate (i.e. it
exceeds 31), we need to move it to a register so vsetvl can consume it.
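As a rough illustration of the immediate-vs-register choice described
above, here is a minimal standalone sketch. It assumes the 5-bit unsigned
AVL immediate of vsetivli (0..31); the helper name avl_fits_vsetivli_imm
is hypothetical and not part of GCC.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  vsetivli encodes the AVL as a
   5-bit unsigned immediate, so only values 0..31 can be used directly.
   Anything larger has to be moved into a scalar register first, which
   is what the scratch operand in the _lra pattern is for.  */
static bool
avl_fits_vsetivli_imm (unsigned long avl)
{
  return avl <= 31;
}
```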
As the second mov insn above has three operands, lra never checks for cycle
danger and promptly creates a cycle :) This patch loosens the conditions on
the cycle check by allowing a third operand that is a clobber. As a
consequence, we also need to prevent scratch regs from being "written back".
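The loosened shape check can be modeled outside of GCC roughly as follows.
This is only a toy sketch: struct toy_pattern, enum toy_code and
toy_move_shape_p are made-up stand-ins for the RTL accessors used in the
patch, not real GCC interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; purely illustrative.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_USE };

/* Toy model of an insn pattern: operand count plus the codes of up to
   two PARALLEL elements (n_elts is 1 for a plain SET).  */
struct toy_pattern
{
  size_t n_operands;
  size_t n_elts;
  enum toy_code elts[2];
};

/* Accept two-operand insns as before, and additionally three-operand
   insns whose pattern is a two-element PARALLEL with a CLOBBER as the
   second element.  */
static bool
toy_move_shape_p (const struct toy_pattern *p)
{
  if (p->n_operands == 2)
    return true;
  return (p->n_operands == 3
          && p->n_elts == 2
          && p->elts[1] == TOY_CLOBBER);
}
```

Only the operand-count part of the condition is modeled here; the
remaining move checks in process_alt_operands are unchanged by the patch.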
I'm not 100% sure about this but figured I'd just send the patch out for
further comments.
Regtested and bootstrapped on x86 and aarch64. The power10 cfarm machine that
I normally use is currently unreachable. Regtested on rv64gcv_zvl512b.
Regards
Robin
PR rtl-optimization/123381
gcc/ChangeLog:
* lra-constraints.cc (process_alt_operands): Detect cycles in
three-operand moves with clobber.
(curr_insn_transform): Don't write back a scratch operand.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr123381.c: New test.
---
gcc/lra-constraints.cc | 12 +++++++++++-
.../gcc.target/riscv/rvv/autovec/pr123381.c | 11 +++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr123381.c
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 87b18f30c98..0d073724043 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -3315,7 +3315,15 @@ process_alt_operands (int only_alternative)
early_clobbered_nops[early_clobbered_regs_num++] = nop;
}
- if (curr_insn_set != NULL_RTX && n_operands == 2
+ if (curr_insn_set != NULL_RTX
+ /* Allow just two operands or three operands where the third
+ is a clobber. */
+ && (n_operands == 2
+ || (n_operands == 3
+ && GET_CODE (PATTERN (curr_insn)) == PARALLEL
+ && XVECLEN (PATTERN (curr_insn), 0) == 2
+ && GET_CODE (XVECEXP (PATTERN (curr_insn), 0, 1))
+ == CLOBBER))
/* Prevent processing non-move insns. */
&& (GET_CODE (SET_SRC (curr_insn_set)) == SUBREG
|| SET_SRC (curr_insn_set) == no_subreg_reg_operand[1])
@@ -4919,6 +4927,8 @@ curr_insn_transform (bool check_only_p)
&& find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
/* OLD can be an equivalent constant here. */
&& !CONSTANT_P (old)
+ /* No need to write back anything for a scratch. */
+ && GET_CODE (old) != SCRATCH
&& (!REG_P(old) || !ira_former_scratch_p (REGNO (old))))
{
start_sequence ();
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr123381.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr123381.c
new file mode 100644
index 00000000000..cc21b0feca4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr123381.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mcpu=xiangshan-kunminghu -mrvv-vector-bits=zvl -fno-tree-ccp -fno-vect-cost-model" } */
+
+char c;
+
+void
+foo(short, short, short, short, int, int, int, int, long x)
+{
+ x /= *(_Complex short *)&x;
+ c = x;
+}
--
2.52.0