On 8/20/20 6:33 PM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Aug 18, 2020 at 02:31:41AM -0400, Michael Meissner wrote:
>> In order to do this, the pass that converts the load address and load/store
>> must occur late in the compilation cycle.
>
> That does not follow afaics.
Let me see if I can help explain this.
I think the issue is that this optimization creates a dependency that
isn't directly represented in RTL. We either have to figure out how to
represent it, or we have to do this very late to avoid problems.
Suppose we are at a point where hard registers have been assigned, and
the RTL looks like:
    addi r5,r3,4
    sldi r6,r5,2
    pld r10,symbol@got@pcrel
    lwz r5,0(r10)
Everything is fine for the optimization to take place: the pld and the
lwz are adjacent, so r10 cannot be redefined, and r5 cannot be used,
between them. So we attach the relocation that tells the linker, if the
symbol is resolved at static link time, to change this to:
    addi r5,r3,4
    sldi r6,r5,2
    plwz r5,symbol@pcrel
    nop
Now, suppose that after we insert the relocation, a later pass reorders
the instructions, for example:
    addi r5,r3,4
    pld r10,symbol@got@pcrel
    sldi r6,r5,2
    lwz r5,0(r10)
When the linker performs the replacement, we now end up with:
    addi r5,r3,4
    plwz r5,symbol@pcrel
    sldi r6,r5,2
    nop
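which has altered the semantics of the program: the sldi now reads r5
after the load has overwritten it, instead of the value computed by the
addi.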
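To make the failure concrete, here is a toy Python model of the four
sequences above. It is only a sketch: the memory contents, the starting
value of r3, and the assumption that r10 is dead after the load are all
made up for illustration, and the instruction semantics are simplified.

```python
# Toy model of the sequences above; not real Power semantics.
MEM = {"symbol": 1000}  # hypothetical word stored at `symbol`

def run(program, r3=1):
    """Execute a list of (op, dst, *srcs) tuples and return the registers."""
    regs = {"r3": r3}
    for op, dst, *src in program:
        if op == "addi":
            regs[dst] = regs[src[0]] + src[1]
        elif op == "sldi":
            regs[dst] = regs[src[0]] << src[1]
        elif op == "pld":      # address of symbol, modeled as its name
            regs[dst] = src[0]
        elif op == "lwz":      # load through the address register
            regs[dst] = MEM[regs[src[0]]]
        elif op == "plwz":     # what the linker substitutes for the pld
            regs[dst] = MEM[src[0]]
        # "nop" changes nothing
    return regs

def live_out(program):
    """Registers we care about afterwards (assumes r10 is dead)."""
    regs = run(program)
    return regs["r5"], regs["r6"]

ADDI = ("addi", "r5", "r3", 4)
SLDI = ("sldi", "r6", "r5", 2)
PLD = ("pld", "r10", "symbol")
LWZ = ("lwz", "r5", "r10")
PLWZ = ("plwz", "r5", "symbol")
NOP = ("nop", None)

adjacent = [ADDI, SLDI, PLD, LWZ]
adjacent_relaxed = [ADDI, SLDI, PLWZ, NOP]
reordered = [ADDI, PLD, SLDI, LWZ]
reordered_relaxed = [ADDI, PLWZ, SLDI, NOP]

# Substitution is harmless when the pair is adjacent...
print(live_out(adjacent) == live_out(adjacent_relaxed))    # True
# ...but not once the sldi has moved between the pld and the lwz.
print(live_out(reordered) == live_out(reordered_relaxed))  # False
```

Note that the reordering by itself is legal (both unrelaxed sequences
give the same r5 and r6); it only becomes wrong once the linker rewrites
the pld.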
What is necessary in order to allow this optimization to occur earlier
is to make this hidden dependency explicit. When the relocation is
inserted, we have to change the "pld" instruction to have a specific
clobber of (in this case) r5, which represents what will happen if the
linker makes the substitution.
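As a rough illustration of why the clobber helps, here is a small Python
sketch of a register dependency check of the kind a scheduler relies on.
The Insn class and may_swap function are hypothetical names for this
example, not GCC internals, and the model ignores memory and everything
else except register defs, uses, and clobbers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insn:
    name: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()
    clobbers: frozenset = frozenset()

def may_swap(a, b):
    """True if adjacent insns a and b have no register dependency."""
    a_writes = a.defs | a.clobbers
    b_writes = b.defs | b.clobbers
    return not (a_writes & b.uses       # b reads what a writes
                or a.uses & b_writes    # b overwrites what a reads
                or a_writes & b_writes) # both write the same register

sldi = Insn("sldi r6,r5,2", defs=frozenset({"r6"}), uses=frozenset({"r5"}))

# pld as it looks today: it only defines r10.
pld_plain = Insn("pld r10,symbol@got@pcrel", defs=frozenset({"r10"}))

# pld with the hidden dependency made explicit as a clobber of r5.
pld_clobber = Insn("pld r10,symbol@got@pcrel",
                   defs=frozenset({"r10"}),
                   clobbers=frozenset({"r5"}))

print(may_swap(sldi, pld_plain))    # True: nothing stops the reordering
print(may_swap(sldi, pld_clobber))  # False: the clobber of r5 blocks it
```

With the clobber in place, moving the pld above the sldi looks like
overwriting a register the sldi still needs, so the problematic
reordering is rejected without any special knowledge of the relocation.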
I agree that it's too fragile to force this to be the last pass, so I
think if Mike can look into introducing a clobber of the hard register
when performing the optimization, that would at least allow us to move
this anywhere after reload.
I don't immediately see a solution that works prior to register
allocation because we basically are representing two potential starting
points of a live range, only one of which will survive in the final
code. That is too ugly a problem to hand to the register allocator.
Thanks,
Bill