Following the instruction cost fix, we are generating

    alsl.w $a0, $a0, $a0, 4

instead of

    li.w  $t0, 17
    mul.w $a0, $a0, $t0

for "x * 17", because alsl.w is 4 times faster than mul.w.  But we
didn't have a sign-extending pattern for alsl.w, causing an extra
slli.w instruction to be generated to sign-extend $a0.  Add the
pattern to remove the redundant extension.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (alslsi3_extend): New
	define_insn.
---
 gcc/config/loongarch/loongarch.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index afbf201d4d0..7b26d15aa4e 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2869,6 +2869,18 @@ (define_insn "alsl<mode>3"
   [(set_attr "type" "arith")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "alslsi3_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (plus:SI
+	    (ashift:SI (match_operand:SI 1 "register_operand" "r")
+		       (match_operand 2 "const_immalsl_operand" ""))
+	    (match_operand:SI 3 "register_operand" "r"))))]
+  ""
+  "alsl.w\t%0,%1,%3,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "SI")])
+
 ;; Reverse the order of bytes of operand 1 and store the result in operand 0.
-- 
2.43.0
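
Postscript, not part of the patch: a minimal reproducer sketch.  The
file and function names (mul17.c, mul17) are hypothetical, and the
assembly listings below illustrate the behavior described in the commit
message rather than captured compiler output.

    /* mul17.c -- hypothetical test case; compile at -O2 for
       loongarch64.  "x * 17" is (x << 4) + x, which alsl.w
       computes in a single instruction.  */
    int
    mul17 (int x)
    {
      return x * 17;
    }

Before this patch, the strength-reduced multiply was followed by a
redundant sign-extension (slli.w by 0 is the canonical sign-extend
idiom on LoongArch, even though alsl.w already sign-extends its 32-bit
result on LA64):

	alsl.w	$a0,$a0,$a0,4
	slli.w	$a0,$a0,0	# redundant sign-extension
	jr	$ra

With the alslsi3_extend pattern, combine can match the sign_extend:DI
of the plus/ashift directly, so only the alsl.w remains:

	alsl.w	$a0,$a0,$a0,4
	jr	$ra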
instead of li.w $t0, 17 mul.w $a0, $t0 for "x * 4", because alsl.w is 4 times faster than mul.w. But we didn't have a sign-extending pattern for alsl.w, causing an extra slli.w instruction generated to sign-extend $a0. Add the pattern to remove the redundant extension. gcc/ChangeLog: * config/loongarch/loongarch.md (alslsi3_extend): New define_insn. --- gcc/config/loongarch/loongarch.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index afbf201d4d0..7b26d15aa4e 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -2869,6 +2869,18 @@ (define_insn "alsl<mode>3" [(set_attr "type" "arith") (set_attr "mode" "<MODE>")]) +(define_insn "alslsi3_extend" + [(set (match_operand:DI 0 "register_operand" "=r") + (sign_extend:DI + (plus:SI + (ashift:SI (match_operand:SI 1 "register_operand" "r") + (match_operand 2 "const_immalsl_operand" "")) + (match_operand:SI 3 "register_operand" "r"))))] + "" + "alsl.w\t%0,%1,%3,%2" + [(set_attr "type" "arith") + (set_attr "mode" "SI")]) + ;; Reverse the order of bytes of operand 1 and store the result in operand 0. -- 2.43.0