On 29/11/13 11:46, Yufeng Zhang wrote:
> On 11/29/13 07:52, Bin.Cheng wrote:
>> After thinking twice, I some kind of think we should not re-associate
>> addresses during expanding, because of lacking of context information.
>>   Take base + scaled_index + offset as an example in PR57540, we just
>> don't know if "base+offset" is loop invariant from either backend or
>> RTL expander.
> 
> I'm getting less convinced by re-associating base with offset 
> unconditionally.  One counter example is
> 
> typedef int arr_1[20];
> void foo (arr_1 a1, int i)
> {
>    a1[i+10] = 1;
> }
> 
> I'm experimenting a patch to get the immediate offset in the above 
> example to be the last addend in the address computing (as mentioned in 
> http://gcc.gnu.org/ml/gcc/2013-11/msg00581.html), aiming to get the 
> following code-gen:
> 
>          add     r1, r0, r1, asl #2
>          mov     r3, #1
>          str     r3, [r1, #40]
> 
> With your patch applied, the effort will be reverted to
> 
>          add     r0, r0, #40
>          mov     r3, #1
>          str     r3, [r0, r1, asl #2]
> 

And another one is:



typedef int arr_1[20];
void foo (arr_1 a1, int i)
{
   a1[i+10] = 1;
   a1[i+11] = 1;
}

This should compile to:

        add     r1, r0, r1, asl #2
        mov     r3, #1
        str     r3, [r1, #40]
        str     r3, [r1, #44]

And which on Thumb2 should then collapse to:

        add     r1, r0, r1, asl #2
        mov     r3, #1
        strd    r3, r3, [r1, #40]

With your patch I don't see any chance of being able to get to this
situation.

(BTW, we currently generate:

        mov     r3, #1
        add     r1, r1, #10
        add     r2, r0, r1, asl #2
        str     r3, [r0, r1, asl #2]
        str     r3, [r2, #4]

which is insane).

I think I see where you're coming from on the original testcase, but I
think you're trying to solve the wrong problem.   In your test case the
base is an eliminable register, which is likely to be replaced with an
offset expression during register allocation.  The problem then, I
think, is that the cost of these virtual registers is treated the same
as any other pseudo register, when it may really have the cost of a PLUS
expression.

Perhaps the cost of using an eliminable register should be raised in
rtx_costs() (treating them as equivalent to (PLUS (reg) (CONST_INT
(TBD))), so that loop optimizations will try to hoist suitable
sub-expressions out the loop and replace them with real pseudos.

R.








Reply via email to