Hi,
As reported in PR57540, GCC chooses a bad addressing mode, with two results:
a) the invariant part of the address expression is not kept separate, so it
cannot be hoisted; b) additional computation is emitted that should have been
encoded in the address expression.  The reason is that when GCC runs into
"addr+offset" (which is invalid) during expansion, it pre-computes the entire
address and accesses memory using "MEM[reg]".  Yet we can force addr into a
register and try to generate "reg+offset", which is valid for targets like
ARM.  By doing this, we can:
1) keep addr in loop-invariant form and hoist it later;
2) save additional computation by taking advantage of the scaled addressing
mode.
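
The change to the condition in offset_address can be modelled in a few lines
of C.  This is only a sketch, not GCC code: struct toy_case and both function
names are hypothetical, and each field stands in for the RTL predicate named
in its comment.

```c
#include <stdbool.h>

/* Hypothetical summary of the facts tested in offset_address.
   These are plain booleans, not GCC's RTL types.  */
struct toy_case
{
  bool sum_is_valid;  /* memory_address_addr_space_p (..., addr + offset) */
  bool addr_is_plus;  /* GET_CODE (addr) == PLUS */
  bool base_is_pic;   /* XEXP (addr, 0) == pic_offset_table_rtx */
};

/* Condition before the patch: addr is forced into a register only
   when its first operand is the PIC register.  */
static bool
forces_addr_before_patch (struct toy_case c)
{
  return !c.sum_is_valid && c.addr_is_plus && c.base_is_pic;
}

/* Condition after the patch: any invalid sum whose addr is a PLUS
   is retried as reg + offset.  */
static bool
forces_addr_after_patch (struct toy_case c)
{
  return !c.sum_is_valid && c.addr_is_plus;
}
```

For an invalid addr+offset where addr is a PLUS whose first operand is not
pic_offset_table_rtx, only the second predicate holds.  That is the kind of
case the patch is meant to catch.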

This issue has a substantial impact in ARM mode, and the fix also benefits
Thumb2, although not as much as ARM mode.  For example, from the bug entry,
assembly code like:
        blt     .L3
.L5:
        add     lr, sp, #2064            @ loop invariant
        add     r2, r2, #1
        add     r3, lr, r3, asl #2
        ldr     r3, [r3, #-2064]
        cmp     r3, #0
        bge     .L5
        uxtb    r2, r2

can be optimized into:

        blt     .L3     
.L5:
        add     r2, r2, #1
        ldr     r3, [sp, r3, asl #2]
        cmp     r3, #0
        bge     .L5
        uxtb    r2, r2
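
Both sequences load from the same location, which can be checked with plain
32-bit arithmetic.  A minimal sketch, assuming registers are modelled as
uint32_t values; addr_before and addr_after are hypothetical names used only
for this illustration.

```c
#include <stdint.h>

/* Address formed by the original sequence:
     add lr, sp, #2064
     add r3, lr, r3, asl #2
     ldr r3, [r3, #-2064]  */
static uint32_t
addr_before (uint32_t sp, uint32_t r3)
{
  uint32_t lr = sp + 2064u;        /* loop-invariant part */
  uint32_t tmp = lr + (r3 << 2);   /* scaled index added by hand */
  return tmp - 2064u;              /* offset folded into the load */
}

/* Address formed by the optimized sequence:
     ldr r3, [sp, r3, asl #2]  */
static uint32_t
addr_after (uint32_t sp, uint32_t r3)
{
  return sp + (r3 << 2);
}
```

The +2064 and -2064 cancel, so the two functions return the same address for
any sp and r3, and the two add instructions inside the loop are not needed.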

Bootstrapped and tested on x86/arm.  Any comments?

Thanks.
bin

2013-06-13  Bin Cheng  <bin.ch...@arm.com>

        PR target/57540
        * emit-rtl.c (offset_address): Try to force ADDR into register and
        generate reg+offset if addr+offset is invalid.
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c      (revision 199949)
+++ gcc/emit-rtl.c      (working copy)
@@ -2175,15 +2175,20 @@ offset_address (rtx memref, rtx offset, unsigned H
 
   /* At this point we don't know _why_ the address is invalid.  It
      could have secondary memory references, multiplies or anything.
+     Yet we can try to force addr into register, in order to catch
+     the scaled addressing opportunity as "reg + scaled_offset".
 
-     However, if we did go and rearrange things, we can wind up not
+     Otherwise, if we did go and rearrange things, we can wind up not
      being able to recognize the magic around pic_offset_table_rtx.
      This stuff is fragile, and is yet another example of why it is
-     bad to expose PIC machinery too early.  */
+     bad to expose PIC machinery too early.  We may also wind up not
+     being able to recognize the scaled addressing pattern.
+
+     It won't hurt because the address here is invalid and we are
+     going to pre-compute it anyway.  */
   if (! memory_address_addr_space_p (GET_MODE (memref), new_rtx,
                                     attrs.addrspace)
-      && GET_CODE (addr) == PLUS
-      && XEXP (addr, 0) == pic_offset_table_rtx)
+      && GET_CODE (addr) == PLUS)
     {
       addr = force_reg (GET_MODE (addr), addr);
       new_rtx = simplify_gen_binary (PLUS, address_mode, addr, offset);
