https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108039
Bug ID: 108039
Summary: Unnecessary extension on riscv-64
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: law at gcc dot gnu.org
CC: rzinsly at ventanamicro dot com
Target Milestone: ---
Target: riscv-64

Compile the following code for rv64 with -O2:

typedef signed long int int64_t;
void replace_weaker_arc(int *id1, int *id2, int64_t number)
{
    *id1 = number;
    *id2 = number;
}

We generate:

replace_weaker_arc:
        sext.w  a2,a2
        sw      a2,0(a0)
        sw      a2,0(a1)
        ret

The key insns (from cse1 dump) are:

(insn 8 5 9 2 (set (reg:DI 134 [ _1 ])
        (sign_extend:DI (subreg:SI (reg/v:DI 137 [ number ]) 0))) "j.c":4:10 116 {extendsidi2}
     (nil))
(insn 9 8 10 2 (set (mem:SI (reg/v/f:DI 135 [ id1 ]) [1 *id1_4(D)+0 S4 A32])
        (subreg/s/u:SI (reg:DI 134 [ _1 ]) 0)) "j.c":4:10 178 {*movsi_internal}
     (nil))
(insn 10 9 0 2 (set (mem:SI (reg/v/f:DI 136 [ id2 ]) [1 *id2_6(D)+0 S4 A32])
        (subreg/s/u:SI (reg:DI 134 [ _1 ]) 0)) "j.c":5:10 178 {*movsi_internal}
     (nil))

fwprop tries to propagate insn 8 into insns 9 and 10, but that fails the
complexity check:

cannot propagate from insn 8 into insn 9: would increase complexity of pattern
cannot propagate from insn 8 into insn 10: would increase complexity of pattern

But propagation in this case allows us to eliminate a subreg & extension, so
it's profitable.

I haven't tested it, but something like this captures the RTL that propagation
generates and the fact that it should simplify. We may need to tighten it a
little by verifying modes:

diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc
index fc652ab9a1f..a86e9320908 100644
--- a/gcc/fwprop.cc
+++ b/gcc/fwprop.cc
@@ -258,6 +258,14 @@ fwprop_propagation::classify_result (rtx old_rtx, rtx new_rtx)
       return CONSTANT | PROFITABLE;
     }
 
+  /* Allow replacements where we are able to eliminate a
+     (subreg (any_extend (...)).  */
+  if (GET_CODE (old_rtx) == SUBREG
+      && (GET_CODE (SUBREG_REG (old_rtx)) == SIGN_EXTEND
+         || GET_CODE (SUBREG_REG (old_rtx)) == ZERO_EXTEND)
+      && XEXP (SUBREG_REG (old_rtx), 0) == new_rtx)
+    return PROFITABLE;
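
For reference, the reason the extension is unnecessary in this test case can be sketched outside the compiler. The snippet below is only an illustrative model, not GCC code: `model_sext_w`, `model_sw` and `extension_is_redundant` are hypothetical names, and the modelling of `sext.w` (sign-extend the low 32 bits) and `sw` (store the low 32 bits) follows the usual RV64 semantics. It shows that the word reaching memory is the same whether or not the extension runs first, which is what the (subreg (sign_extend (subreg ...))) simplification relies on.

```c
#include <stdint.h>

/* Hypothetical model of sext.w: keep the low 32 bits of X and copy
   bit 31 into bits 63..32.  Uses only unsigned arithmetic so the
   result does not depend on implementation-defined conversions.  */
uint64_t
model_sext_w (uint64_t x)
{
  uint64_t low = x & 0xffffffffu;
  return (low & 0x80000000u) ? (low | 0xffffffff00000000ull) : low;
}

/* Hypothetical model of sw: only the low 32 bits of the source
   register are written to memory.  */
uint32_t
model_sw (uint64_t x)
{
  return (uint32_t) x;
}

/* Return 1 if the stored word is identical with and without the
   preceding extension, 0 otherwise.  */
int
extension_is_redundant (uint64_t number)
{
  return model_sw (model_sext_w (number)) == model_sw (number);
}
```

Since the extension only changes bits 63..32 and the store only reads bits 31..0, `extension_is_redundant` should hold for every input. This says nothing about whether the proposed `classify_result` change is the right place to catch it, or about the mode checks mentioned above.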