Re: [PATCH 0/3] Power10 PCREL_OPT support

Bill Schmidt via Gcc-patches Sat, 22 Aug 2020 17:06:50 -0700

On 8/20/20 6:33 PM, Segher Boessenkool wrote:

> Hi!
>
> On Tue, Aug 18, 2020 at 02:31:41AM -0400, Michael Meissner wrote:
> > In order to do this, the pass that converts the load address and load/store
> > must occur late in the compilation cycle.
>
> That does not follow afaics.

Let me see if I can help explain this.

I think the issue is that this optimization creates a dependency that isn't directly represented in RTL. We either have to figure out how to represent it, or we have to do this very late to avoid problems.

Suppose we are at a point where hard registers have been assigned, and the RTL looks like:


    addi  r5,r3,4
    sldi  r6,r5,2
    pld  r10,symbol@got@pcrel
    lwz  r5,0(r10)

Everything is fine for the optimization to take place, since the two instructions (the pld and the lwz) are adjacent and therefore we can't have any problems with r10 being redefined in between, or r5 being used. So we stick on the relocation telling the linker to change this, if the symbol is resolved at static link time, to:


    addi  r5,r3,4
    sldi  r6,r5,2
    plwz  r5,symbol@pcrel
    nop

Now, suppose after we insert the relocation we get a reordering of instructions such as


    addi  r5,r3,4
    pld  r10,symbol@got@pcrel
    sldi  r6,r5,2
    lwz  r5,0(r10)

When the linker performs the replacement, we will now end up with

    addi  r5,r3,4
    plwz  r5,symbol@pcrel
    sldi  r6,r5,2
    nop

which has altered the semantics of the program.
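
To make the hazard concrete, here is a toy Python model of the four instructions above. It is only a sketch, not a Power ISA simulator and not anything from the patch: `SYMBOL_ADDR`, the memory contents, the initial value of r3, and the `run`/`relax`/`visible` helpers are all made up for illustration, and r10 is assumed to be dead after the load.

```python
# Toy model of the example. SYMBOL_ADDR, the stored value, and the initial
# r3 are arbitrary illustration values, not anything from the patch.
SYMBOL_ADDR = 0x1000

def run(program, r3=10, symbol_value=99):
    """Execute a list of (opcode, operands...) tuples; return final registers."""
    regs = {"r3": r3, "r5": 0, "r6": 0, "r10": 0}
    mem = {SYMBOL_ADDR: symbol_value}
    for op, *args in program:
        if op == "addi":        # addi rD,rA,imm
            rd, ra, imm = args
            regs[rd] = regs[ra] + imm
        elif op == "sldi":      # sldi rD,rA,shift
            rd, ra, shift = args
            regs[rd] = regs[ra] << shift
        elif op == "pld":       # pld rD,symbol@got@pcrel: fetch the symbol's address
            (rd,) = args
            regs[rd] = SYMBOL_ADDR
        elif op == "lwz":       # lwz rD,off(rA)
            rd, off, ra = args
            regs[rd] = mem[regs[ra] + off]
        elif op == "plwz":      # plwz rD,symbol@pcrel: load the symbol directly
            (rd,) = args
            regs[rd] = mem[SYMBOL_ADDR]
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs

def relax(program):
    """Mimic the linker substitution: pld becomes plwz into the lwz's
    destination register, and the lwz becomes a nop."""
    load_dest = next(args[0] for op, *args in program if op == "lwz")
    out = []
    for insn in program:
        if insn[0] == "pld":
            out.append(("plwz", load_dest))
        elif insn[0] == "lwz":
            out.append(("nop",))
        else:
            out.append(insn)
    return out

def visible(regs):
    """Registers that matter after the sequence (r10 is assumed dead)."""
    return regs["r5"], regs["r6"]

adjacent = [("addi", "r5", "r3", 4), ("sldi", "r6", "r5", 2),
            ("pld", "r10"), ("lwz", "r5", 0, "r10")]
# Same instructions with the pld hoisted above the sldi.
reordered = [adjacent[0], adjacent[2], adjacent[1], adjacent[3]]

print(visible(run(adjacent)), visible(run(reordered)))    # (99, 56) (99, 56)
print(visible(run(relax(adjacent))))                      # (99, 56)
print(visible(run(relax(reordered))))                     # (99, 396)
```

Before the linker substitution the two orderings agree, so the reordering looks legal to the compiler. After the substitution, the reordered sequence has the sldi reading the loaded value instead of the addi result, so r6 changes.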

What is necessary in order to allow this optimization to occur earlier is to make this hidden dependency explicit. When the relocation is inserted, we have to change the "pld" instruction to have a specific clobber of (in this case) r5, which represents what will happen if the linker makes the substitution.
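
As a rough illustration of why the clobber helps, the sketch below models each instruction by the registers it writes and reads and applies the usual data-dependence test before swapping two neighbours. This is not GCC's scheduler or its RTL representation; the `Insn` class and `can_swap` helper are hypothetical names invented for this example, and a clobber is simply treated as one more written register.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insn:
    text: str
    defs: frozenset = frozenset()   # registers written, clobbers included
    uses: frozenset = frozenset()   # registers read

def can_swap(first, second):
    """True if two adjacent instructions may exchange places without
    violating a read-after-write, write-after-read, or write-after-write
    dependence."""
    raw = first.defs & second.uses
    war = first.uses & second.defs
    waw = first.defs & second.defs
    return not (raw or war or waw)

sldi = Insn("sldi r6,r5,2", defs=frozenset({"r6"}), uses=frozenset({"r5"}))

# pld as the compiler sees it today: it only writes r10.
pld_plain = Insn("pld r10,symbol@got@pcrel", defs=frozenset({"r10"}))

# pld with the proposed clobber of r5 made explicit.
pld_clobber = Insn("pld r10,symbol@got@pcrel",
                   defs=frozenset({"r10", "r5"}))

print(can_swap(sldi, pld_plain))     # True: nothing stops the bad reordering
print(can_swap(sldi, pld_clobber))   # False: sldi reads r5, pld now writes it
```

With the clobber in place, hoisting the pld above the sldi shows up as an ordinary write-after-read conflict on r5, so no special knowledge of the relocation is needed to reject it.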

I agree that it's too fragile to force this to be the last pass, so I think if Mike can look into introducing a clobber of the hard register when performing the optimization, that would at least allow us to move this anywhere after reload.

I don't immediately see a solution that works prior to register allocation because we basically are representing two potential starting points of a live range, only one of which will survive in the final code. That is too ugly a problem to hand to the register allocator.


Thanks,
Bill
