Richard Sandiford wrote:
Ian Lance Taylor <i...@google.com> writes:
Richard Sandiford <rdsandif...@googlemail.com> writes:
Does anyone else have any thoughts before I make that change?
I think that one of you should try to write a test case where it makes a
difference, and add the test case to the testsuite.
I originally took that to mean a case where function vs. bb speed choices
made a difference. That isn't really possible as things stand because
I don't know of any in-tree port that assigns different rtx costs to
SETs based on the speed setting.
But now I wonder whether you meant a test case where using rtx costs
makes a difference. I'm not really in a position to test ARM these days,
but it sounds like any testcase for the VUNZP patch would cover this too,
since it was this patch that prevented the VUNZP one from going in.
I'll also try to come up with a MIPS testcase when I look at that
(this weekend hopefully, but maybe not on recent form).
Anyway, I modified the patch to use a per-function speed setting.
After the off-list discussion between you and Kenny, I went ahead
and applied it after retesting on x86_64-linux-gnu.
To repeat: as things stand, very few targets define proper rtx costs
for SET. This patch is therefore expected to prevent lower-subreg
from running in cases where it's actually beneficial. If you see that
happening, please check whether the rtx_costs are defined properly.
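
To make the effect of the SET costs concrete, here is a minimal sketch
of the kind of comparison involved. It is not the code in lower-subreg;
the names sketch_should_split and SKETCH_COSTS_N_INSNS are made up for
illustration, and the "one insn is 4 units" scale is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost scale for this sketch: one instruction == 4 units.  */
#define SKETCH_COSTS_N_INSNS(n) ((n) * 4)

/* Return true if splitting a multi-word move looks no more expensive
   than keeping it whole.
   wide_set_cost: cost the target reports for one SET in the wide mode.
   word_set_cost: cost the target reports for one word-sized SET.
   n_words:       number of words in the wide mode.  */
static bool
sketch_should_split (int wide_set_cost, int word_set_cost, int n_words)
{
  assert (n_words > 0);
  return wide_set_cost >= word_set_cost * n_words;
}
```

If a port reports the same cost for every SET regardless of mode, a
two-word move costs 4 units whole and 8 units split, so the split is
rejected. Once the wide SET is costed as two instructions, the two
sides are equal and the split is allowed.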
It's hardly possible to write proper rtx_costs for SET:
1) What should be the cost of (const_int 1) if you don't see the
machine mode? Is it QI, is it HI, is it SI or whatever?
There are platforms where this matters, for example the platform this
PR was initially reported for.
2) If the target will be a REG, what is the register class for the
assignment? rtx_costs are called after reload, so it would be good to
know. It would be good to know if it is a pseudo or hard reg.
And in many places the backend does not know where it is standing.
Is it upon expanding? Before or after combine? Or split1?
3) Likewise, for the cost of a MEM, the MEM is peeled off and just
the address is passed, without any information about the MEM such as
its address space. The cost might depend heavily on the address space.
The original PR arose because splitting mem:HI is fine if it reads
from the generic address space, but is complete bloat for other
address spaces. Likewise for wider modes like PSI, SI, ...
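
As a rough illustration of point 3, the sketch below models a load cost
that depends on the address space. Everything in it is hypothetical:
the sketch_* names, the two address spaces, and the numbers (1 unit per
byte, 3 units of per-access setup outside the generic space) are
assumptions chosen only to show the shape of the problem, not real
costs of any port.

```c
#include <assert.h>

/* Hypothetical address spaces for this sketch.  */
enum sketch_addr_space { SKETCH_AS_GENERIC, SKETCH_AS_OTHER };

/* A memory access described by its size and its address space.  */
struct sketch_mem
{
  int size_bytes;
  enum sketch_addr_space as;
};

/* Assumed cost model: 1 unit per byte loaded, plus a fixed per-access
   setup cost of 3 units when the access is not to the generic space.  */
static int
sketch_load_cost (struct sketch_mem m)
{
  assert (m.size_bytes > 0);
  int setup = (m.as == SKETCH_AS_GENERIC) ? 0 : 3;
  return setup + m.size_bytes;
}

/* Cost of the same load when split into separate byte loads.  */
static int
sketch_split_load_cost (struct sketch_mem m)
{
  struct sketch_mem byte = { 1, m.as };
  return m.size_bytes * sketch_load_cost (byte);
}
```

Under these assumed numbers a 2-byte generic load costs 2 whole and 2
split, while the same load from the other address space costs 5 whole
and 8 split. A cost hook that only sees the address cannot tell the two
cases apart.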
ad 3) I wonder whether the patch helps with the avr backend at all.
Does it improve the situation in any way? And is it worth cleaning up
the avr backend and removing the FIXMEs there? I.e., expand reads from
MEM as they are instead of hiding everything inside UNSPECs?
Johann
Of course, if the costs are defined properly and lower-subreg still
makes the wrong choice, we need to look at why.
Richard