When the attached source code is compiled with the options -Os -march=armv7-a -mthumb, gcc generates:

bar4:
        push    {r3, r4, r5, lr}
        ldr     r2, [r0, #520]
        mov     r4, r0
        mov     r3, r0
        mov     r1, r0
        movs    r0, #0          // A
        b       .L2
.L4:
        ldrb    r5, [r3], #1    @ zero_extendqisi2
        cmp     r5, #10
        itt     eq
        moveq   r1, r3
        strbeq  r0, [r3, #-1]
        subs    r2, r2, #1
.L2:
        cmp     r2, #0
        bgt     .L4
        cmp     r1, r4
        bne     .L8
        movs    r3, #0          // B
        strb    r3, [r1, #512]  // C
        b       .L1
.L8:
        ldr     r3, [r4, #520]
        adds    r3, r4, r3
        cmp     r1, r3
        bcc     .L7
        movs    r3, #0          // D
        str     r3, [r4, #520]  // E
        b       .L1
.L7:
        subs    r5, r1, r4
        mov     r0, r4
        mov     r2, r5
        bl      memmove
        str     r5, [r4, #520]
.L1:
        pop     {r3, r4, r5, pc}

Instruction B loads the constant 0 into register r3, and instruction C stores that 0 to memory. However, instruction A has already loaded 0 into register r0, and r0 still holds 0 at instruction C, so instruction C can use r0 directly and instruction B can be removed. Register r2 also contains 0 at instruction C (at least when the initial count is non-negative), but that is more difficult to detect.

In the same way, r0 can be used at instruction E, and instruction D can be removed.

When compiled with -O2 the result is similar.

Should this be handled by a CSE pass, with the constant rematerialized only if register pressure is high?

--
Summary: Multiple load 0 to register
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: carrot at google dot com
GCC build triplet: i686-linux
GCC host triplet: i686-linux
GCC target triplet: arm-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025
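
The attachment is not reproduced in this message. As a rough guide to what the listing computes, the following C sketch was reconstructed by hand from the assembly; it is not the original source. The struct name, its field names, and the filler field at offset 516 are assumptions chosen only so that the byte store lands at offset 512 and the word accesses at offset 520. The body mirrors the listing step by step rather than guessing at the intent, so the memmove length is simply the value the listing passes in r2. The three stores of 0 are the ones that could share a single register.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memmove */

/* Hypothetical layout, chosen only to match the offsets in the listing:
   a byte store at #512 and word loads/stores at #520. */
struct line_buf {
    char data[512];
    char terminator;  /* offset 512, target of instruction C */
    int  reserved;    /* offset 516, filler; its real purpose is unknown */
    int  len;         /* offset 520, target of instruction E */
};

static_assert(offsetof(struct line_buf, terminator) == 512, "byte field at 512");
static_assert(offsetof(struct line_buf, len) == 520, "len field at 520");

void bar4(struct line_buf *s)
{
    char *p = s->data;      /* r3 */
    char *last = s->data;   /* r1 */
    int n;                  /* r2 in the loop, r5 afterwards */

    for (n = s->len; n > 0; n--) {
        if (*p++ == '\n') {
            last = p;
            p[-1] = 0;      /* first store of 0 (uses r0 from A) */
        }
    }
    if (last == s->data) {
        s->terminator = 0;  /* second store of 0 (B, C) */
        return;
    }
    if (last >= s->data + s->len) {
        s->len = 0;         /* third store of 0 (D, E) */
        return;
    }
    n = (int)(last - s->data);
    memmove(s->data, last, (size_t)n);
    s->len = n;
}
```

Compiling a function of this shape with the options given above should make it possible to check whether the separate loads of 0 still appear, although the exact output depends on the compiler version.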