https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116625
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-09-06
   Target Milestone|---                         |15.0
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

IRA is now selecting r12, where before it was selecting r4:
      Popping a0(r117,l0)  -- assign reg 12
vs
      Popping a0(r117,l0)  -- assign reg 4

I think this is fine, because we are saving one less register. It just
happens that in this case the shift is shorter when using r4 rather than r12.

The pattern used in this case:
```
(define_insn "*arm_shiftsi3"
  [(set (match_operand:SI   0 "s_register_operand" "=l,l,r,r")
        (match_operator:SI  3 "shift_operator"
         [(match_operand:SI 1 "s_register_operand"  "0,l,r,r")
          (match_operand:SI 2 "reg_or_int_operand" "l,M,M,r")]))]
  "TARGET_32BIT"
  "* return arm_output_shift(operands, 0);"
  [(set_attr "predicable" "yes")
   (set_attr "arch" "t2,t2,*,*")
   (set_attr "predicable_short_it" "yes,yes,no,no")
   (set_attr "length" "4")
   (set_attr "shift" "1")
   (set_attr "autodetect_type" "alu_shift_operator3")]
)
```
should have given priority to l, which is the low register class, but since
this is one basic block, it is worse to save/restore the callee-saved register
than to use lr here.

In fact, if we look at the scan-assembler directives for bitfield-4.c, we see
that they try to support r8-10/fp/ip but miss that the shift might then not be
the short form (that is, lsr vs lsrs and lsl vs lsls).

/* { dg-final { scan-assembler "lsrs\t(r\[3-9\]|r10|fp|ip), \\1, #1.*blxns\t\\1" } } */
/* { dg-final { scan-assembler "lsls\t(r\[3-9\]|r10|fp|ip), \\1, #1.*blxns\t\\1" } } */

So yes, this is just a testcase issue; it only needs a small testcase change.
I think s/lsls/lsl(s)/ and s/lsrs/lsr(s)/ will fix the issue.
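
To illustrate the suggested testcase change, here is a minimal sketch in Python rather than DejaGnu/Tcl. It assumes that "lsr(s)" means "make the trailing s optional", and the two assembly snippets are invented for the demonstration; they are not actual compiler output for bitfield-4.c. The sketch writes the optional s as `lsrs?` rather than as a new parenthesised group, because an extra capture group in front of the register group would renumber the `\1` backreference.

```python
import re

# Register alternatives copied from the dg-final directives in bitfield-4.c.
REGS = r"(r[3-9]|r10|fp|ip)"

# Current pattern: requires the flag-setting short form "lsrs".
# re.DOTALL lets ".*" span newlines, approximating how the Tcl regex is used.
strict = re.compile(r"lsrs\t" + REGS + r", \1, #1.*blxns\t\1", re.DOTALL)

# Relaxed pattern (assumed reading of the suggestion): the trailing "s" is optional.
relaxed = re.compile(r"lsrs?\t" + REGS + r", \1, #1.*blxns\t\1", re.DOTALL)

# Hypothetical assembly: low register r4 gets the short "lsrs"/"lsls" forms.
asm_r4 = "\tlsrs\tr4, r4, #1\n\tlsls\tr4, r4, #1\n\tblxns\tr4\n"

# Hypothetical assembly: high register ip (r12) gets plain "lsr"/"lsl".
asm_ip = "\tlsr\tip, ip, #1\n\tlsl\tip, ip, #1\n\tblxns\tip\n"

for name, asm in (("r4", asm_r4), ("ip", asm_ip)):
    print(name,
          "strict:", bool(strict.search(asm)),
          "relaxed:", bool(relaxed.search(asm)))
```

Under these assumptions the strict pattern accepts only the r4 variant, while the relaxed one accepts both, which is the behaviour the testcase needs once IRA picks r12. The same change would apply to the lsls directive.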