Compile the following code with options -Os -march=armv7-a -mthumb:

    extern long long foo();

    void bar2(long long* p)
    {
      long long t = foo();
      *p = t;
    }

GCC generates:

    bar2:
            push    {r4, lr}
            mov     r4, r0
            bl      foo
            mov     r2, r0      // A
            mov     r3, r1      // B
            strd    r2, [r4]    // C
            pop     {r4, pc}

1. The register moves in instructions A and B are a regression relative to gcc 4.4.0. The result of gcc 4.4 is:

            push    {r4, lr}
            mov     r4, r0
            bl      foo
            strd    r0, [r4]
            pop     {r4, pc}

The regression may be caused by some changes in IRA. Before IRA, both versions have a similar rtx sequence:

    (call_insn 6 3 7 2 src/tb.c:4 (parallel [
                (set (reg:DI 0 r0)
                    (call (mem:SI (symbol_ref:SI ("foo") [flags 0x41] <function_decl 0x7f623ba6a600 foo>) [0 S4 A32])
                        (const_int 0 [0x0])))
                (use (const_int 0 [0x0]))
                (clobber (reg:SI 14 lr))
            ]) 255 {*call_value_insn} (nil)
        (nil))

    (insn 7 6 8 2 src/tb.c:4 (set (reg/v:DI 133 [ t ])
            (reg:DI 0 r0)) 657 {*thumb2_movdi} (nil))

    (insn 8 7 0 2 src/tb.c:5 (set (mem:DI (reg/v/f:SI 134 [ p ]) [2 S8 A64])
            (reg/v:DI 133 [ t ])) 657 {*thumb2_movdi} (expr_list:REG_DEAD (reg/v/f:SI 134 [ p ])
            (expr_list:REG_DEAD (reg/v:DI 133 [ t ])
                (nil))))

After IRA, gcc 4.5 generates:

    (call_insn 6 3 7 2 src/tb.c:4 (parallel [
                (set (reg:DI 0 r0)
                    (call (mem:SI (symbol_ref:SI ("foo") [flags 0x41] <function_decl 0x7f623ba6a600 foo>) [0 S4 A32])
                        (const_int 0 [0x0])))
                (use (const_int 0 [0x0]))
                (clobber (reg:SI 14 lr))
            ]) 255 {*call_value_insn} (nil)
        (nil))

    (insn 7 6 8 2 src/tb.c:4 (set (reg/v:DI 2 r2 [orig:133 t ] [133])
            (reg:DI 0 r0)) 657 {*thumb2_movdi} (expr_list:REG_EQUIV (mem:DI (reg/v/f:SI 4 r4 [orig:134 p ] [134]) [2 S8 A64])
            (nil)))

    (insn 8 7 11 2 src/tb.c:5 (set (mem:DI (reg/v/f:SI 4 r4 [orig:134 p ] [134]) [2 S8 A64])
            (reg/v:DI 2 r2 [orig:133 t ] [133])) 657 {*thumb2_movdi} (nil))

But gcc 4.4 generates:

    (call_insn 6 3 8 2 src/tb.c:4 (parallel [
                (set (reg:DI 0 r0)
                    (call (mem:SI (symbol_ref:SI ("foo") [flags 0x41] <function_decl 0xf7d1c880 foo>) [0 S4 A32])
                        (const_int 0 [0x0])))
                (use (const_int 0 [0x0]))
                (clobber (reg:SI 14 lr))
            ]) 257 {*call_value_insn} (nil)
        (nil))

    (insn 8 6 16 2 src/tb.c:5 (set (mem:DI (reg/v/f:SI 4 r4 [orig:134 p ] [134]) [2 S8 A64])
            (reg/v:DI 0 r0 [orig:133 t ] [133])) 651 {*thumb2_movdi} (nil))

That is, gcc 4.4 allocates t directly to r0 and drops the copy in insn 7, while gcc 4.5 allocates t to r2 and keeps the copy.

2. Since r4 is never used again after instruction C, the store can also be written as an stm instruction. In Thumb-2, strd is a 32-bit instruction, but stm is 16 bits.

-- 
Summary: Extra register move
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: carrot at google dot com
GCC build triplet: i686-linux
GCC host triplet: i686-linux
GCC target triplet: arm-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43616
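
For reference, the sequence that items 1 and 2 together ask for would look roughly like the following. This is a hand-written sketch, not output from any GCC version. It assumes the 16-bit Thumb encoding of stmia, which takes low registers only and always writes back to the base register; the writeback is harmless here only because r4 is dead after the store and is restored by the pop.

    bar2:
            push    {r4, lr}
            mov     r4, r0
            bl      foo
            stmia   r4!, {r0, r1}   // assumed 16-bit encoding; stores r0 at [r4], r1 at [r4, #4]
            pop     {r4, pc}

Under that assumption the store shrinks from 4 bytes (strd) to 2 bytes, in addition to the two 16-bit moves saved by fixing item 1.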