Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
>> But even if the costs are too high, the patch seems to be overcompensating.
>> It doesn't make logical sense for an ADRP+LDR to be cheaper than an LDR.
>
> An LDR is not a replacement for ADRP+LDR; you need a store in addition to the
> original ADRP+LDR. Basically, for a simple spill you would be comparing these
> two sequences:
>
> ADRP x0, ...
> LDR x0, [x0, ...]
> STR x0, [SP, ...]
> ...
> LDR x0, [SP, ...]
>
>
> ADRP x0, ...
> LDR x0, [x0, ...]
> ...
> ADRP x0, ...
> LDR x0, [x0, ...]
>
> Obviously it's far cheaper to do the latter than the former.

Sure.  Like I say, I'm not disagreeing with the intent of reducing
spilling and promoting rematerialisation.  I agree we should do that.

I'm just disagreeing with the approach of using rtx_costs.  The rtx_cost
hook isn't being asked the question: is spilling this better value than
rematerialising it?  It's being asked for the cost of an operation, on
the understanding that that cost will be compared with the cost of other
operations.  An ADRP+LDR operation then ought to be at least as costly
as an LDR, because in a two-way comparison, it is.
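
To make the two-way comparison concrete, here is a minimal sketch in C.
It is not GCC code: the enum, the per-instruction numbers and
sequence_cost() are all invented for illustration.  The only point is
that a cost function which sums non-negative per-instruction costs can
never rank ADRP+LDR below a lone LDR.

```c
#include <stddef.h>

/* Hypothetical instruction kinds; these are not GCC's RTL codes.  */
enum insn_kind { INSN_ADRP, INSN_LDR, INSN_STR };

/* Made-up per-instruction costs, for illustration only.  */
unsigned insn_cost(enum insn_kind k)
{
    switch (k) {
    case INSN_ADRP: return 1;
    case INSN_LDR:  return 4;
    case INSN_STR:  return 4;
    }
    return 0;
}

/* The cost of a sequence is the sum of the costs of its instructions,
   so adding an instruction can only keep the total equal or raise it.  */
unsigned sequence_cost(const enum insn_kind *seq, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += insn_cost(seq[i]);
    return total;
}
```

With these assumed numbers, { INSN_ADRP, INSN_LDR } costs 5 and
{ INSN_LDR } costs 4.  Whether spilling beats rematerialising is a
separate question that compares whole sequences, not single operations.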

[…]

>> Maybe it would help to turn the question around for a minute.  Can we
>> describe the cases in which it's *better* for the RA to spill a constant
>> address to the stack and reload it, rather than rematerialise on demand?
>
> Rematerialization is almost always better than spilling and reloading from the
> stack. If the constant requires multiple instructions and there are more than
> 2 references it would be better for codesize to spill, but for performance it
> is better to rematerialize unless there are many references.
>
> You also want to prefer rematerialization over spilling a different live range
> when other aspects are comparable.
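
The code-size rule of thumb quoted above can be checked by counting
instructions.  The sketch below is an assumption-laden model, not
anything taken from GCC: it assumes the spill path materialises the
constant once, stores it once and reloads it with one instruction at
each later reference, while the rematerialisation path rebuilds it at
every reference.  Both function names are made up.

```c
/* k:    instructions needed to materialise the constant
         (2 for an ADRP+LDR pair).
   refs: number of places the value is needed, counting the first;
         assumed to be at least 1.  */

/* Materialise once, store once, reload at each later reference.  */
unsigned spill_insns(unsigned k, unsigned refs)
{
    return k + 1 + (refs - 1);
}

/* Materialise afresh at every reference.  */
unsigned remat_insns(unsigned k, unsigned refs)
{
    return k * refs;
}
```

For k = 2 the two counts tie at 4 instructions with two references,
which matches the two sequences earlier in the thread, and spilling
becomes smaller from the third reference on (5 against 6).  For k = 1
rematerialising is never larger.  This counts instructions only and
says nothing about performance.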

Yeah, that's what I thought the answer would be.  So the question is:
why is the RA choosing to spill and reload rather than rematerialise
these values?  Does it not know how to rematerialise them, and so we
rely on earlier passes not reusing the constants?  Or does the RA
know how but decides it isn't worthwhile, because of the way that
the RA uses the target costs?  If the latter, I would be much happier with
a new hook that allows the target to force the RA to rematerialise a given
value, if that's the heuristic we want to use when optimising for speed.
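
As a rough illustration of what such a hook could look like, here is a
self-contained sketch in C.  Everything in it is hypothetical: GCC has
no hook with this name or shape, and struct candidate, force_remat_hook
and should_rematerialise() are invented purely to show the idea of a
target override taking precedence over the cost comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a spill candidate; not a GCC type.  */
struct candidate {
    bool is_constant_address;  /* e.g. a value built by ADRP+LDR */
    unsigned spill_cost;       /* store plus reloads, made-up units */
    unsigned remat_cost;       /* recomputation at each use, same units */
};

/* Hypothetical target hook: return true to force rematerialisation.  */
typedef bool (*force_remat_hook)(const struct candidate *);

/* Example target policy: always rematerialise constant addresses.  */
bool example_target_force_remat(const struct candidate *c)
{
    return c->is_constant_address;
}

/* Allocator-side decision: the hook, if present, overrides the
   ordinary cost comparison; otherwise the cheaper option wins.  */
bool should_rematerialise(const struct candidate *c, force_remat_hook hook)
{
    if (hook != NULL && hook(c))
        return true;
    return c->remat_cost < c->spill_cost;
}
```

The point of this shape is that the target states the heuristic
directly, so the operation costs themselves can stay consistent with
each other.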

Thanks,
Richard
