[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

--- Comment #7 from Steven Bosscher 2012-07-24 18:16:17 UTC ---
Prototype patch attached to this email:
http://gcc.gnu.org/ml/gcc/2012-07/msg00189.html

It's not a finished patch, but it should be a good starting point for anyone
who really wants to fix this problem:

bar4:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        mov     r3, r0
        ldr     r0, [r0, #520]
        push    {r4, r5, r6, lr}
        mov     r2, r3
        mov     r1, r3
        adds    r0, r3, r0
        movs    r6, #0
        b       .L9
.L4:
        ldrb    r5, [r2], #1    @ zero_extendqisi2
        cmp     r5, #10
        bne     .L9
        mov     r1, r2
        movs    r4, #0
        strb    r6, [r2, #-1]
.L9:
        subs    r5, r0, r2
        cmp     r5, #0
        bgt     .L4
        cmp     r1, r3
        bne     .L5
        strb    r4, [r1, #512]
        pop     {r4, r5, r6, pc}
.L10:
        subs    r4, r1, r3
        mov     r0, r3
        mov     r2, r4
        bl      memmove
        mov     r3, r0
.L11:
        str     r4, [r3, #520]
        pop     {r4, r5, r6, pc}
.L5:
        ldr     r2, [r3, #520]
        adds    r2, r3, r2
        cmp     r1, r2
        bcs     .L11
        b       .L10
[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

--- Comment #6 from Steven Bosscher 2012-07-24 15:51:39 UTC ---
(In reply to comment #4)
> The questions are:
> 1, why pre does not do such optimization;

Because PRE (gcse.c) doesn't run at -Os.
[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

--- Comment #5 from amker.cheng 2011-11-02 06:05:23 UTC ---
Created attachment 25687
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25687
reduced test case which can be handled by cse pass
[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

--- Comment #4 from amker.cheng 2011-11-02 06:03:56 UTC ---
I noticed that for the attached reduced test case "reduced_test.c", the cse
pass can eliminate such redundant load-constant instructions. But since cse
works on extended basic blocks rather than globally, it can do nothing for
the original case.

The questions are:
1, why pre does not do such optimization;
2, if pre does do the work, surely the live range of r0 is extended, which
   might harm the register allocation;

Also I found regcprop.c, which is a peephole pass that eliminates redundant
register moves. It should be able to work for redundant constant load insns
if:
a) it is extended in a value numbering way, at least for these constant
   values;
b) it is extended in a global data analysis way;

Such a change might also impact the scheduling pass, and I am not sure how
much benefit it would bring for common code.
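
To make the local-versus-global point in comment #4 concrete, here is a minimal toy sketch in C. It is not GCC code and does not reflect how cse or regcprop.c are actually implemented; the `toy_insn` type, the `OP_*` opcodes and `remove_redundant_const_loads` are hypothetical names invented for illustration. The pass remembers which constant each register holds and forgets everything at a label, so a reload of 0 in the same block is caught, while the same reload after a label is not.

```c
#include <stdbool.h>
#include <stddef.h>

enum { NUM_REGS = 16 };

/* Hypothetical toy instruction set: just enough for the sketch. */
typedef enum { OP_LOAD_CONST, OP_CLOBBER, OP_LABEL } toy_op;

typedef struct {
    toy_op op;
    int    reg;    /* destination register, 0..NUM_REGS-1 (unused for OP_LABEL) */
    long   value;  /* constant loaded by OP_LOAD_CONST */
    bool   dead;   /* set by the pass when the load is redundant */
} toy_insn;

/* Mark constant loads that reload a value the register already holds.
   All knowledge is dropped at every label, so the analysis is purely
   block-local. Returns the number of instructions marked dead. */
size_t remove_redundant_const_loads(toy_insn *insns, size_t n)
{
    bool   known[NUM_REGS] = { false };  /* does the register hold a known constant? */
    long   held[NUM_REGS]  = { 0 };      /* which constant, if known */
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        toy_insn *insn = &insns[i];
        switch (insn->op) {
        case OP_LABEL:
            /* Control may arrive from elsewhere: forget everything. */
            for (int r = 0; r < NUM_REGS; r++)
                known[r] = false;
            break;
        case OP_CLOBBER:
            /* Register overwritten with an unknown value. */
            known[insn->reg] = false;
            break;
        case OP_LOAD_CONST:
            if (known[insn->reg] && held[insn->reg] == insn->value) {
                insn->dead = true;       /* same constant already there */
                removed++;
            } else {
                known[insn->reg] = true;
                held[insn->reg]  = insn->value;
            }
            break;
        }
    }
    return removed;
}
```

Removing the reset at `OP_LABEL` safely would require knowing what every predecessor block leaves in each register, which is the "global data analysis" extension the comment asks about.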
[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.6.1                       |---
[Bug rtl-optimization/44025] Multiple load 0 to register
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.6.0                       |4.6.1

--- Comment #3 from Jakub Jelinek 2011-03-25 19:53:06 UTC ---
GCC 4.6.0 is being released, adjusting target milestone.
[Bug rtl-optimization/44025] Multiple load 0 to register
--- Comment #2 from ramana at gcc dot gnu dot org 2010-09-01 10:34 ---
I'm not sure where this will be handled but I can see this with trunk today.

--
ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|-00-00 00:00:00             |2010-09-01 10:34:47
               date|                            |
   Target Milestone|---                         |4.6.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025
[Bug rtl-optimization/44025] Multiple load 0 to register
--- Comment #1 from carrot at google dot com 2010-05-07 13:19 ---
Created an attachment (id=20596)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20596&action=view)
test case

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025