On 13-10-21 11:51 AM, Michael Meissner wrote:
On Sun, Oct 20, 2013 at 10:48:08PM -0400, Vladimir Makarov wrote:
On 13-10-18 11:26 AM, David Edelsohn wrote:
On Thu, Oct 3, 2013 at 5:02 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
The following patch permits today's trunk to use LRA for ppc by default.
To switch it off -mno-lra can be used.

The patch was bootstrapped on ppc64.  The GCC testsuite shows no
regressions either (in comparison with reload).  The change in rs6000.md
fixes an LRA failure on a recently added ppc test.
Vlad,

I have not forgotten this patch. We are trying to figure out the right
timeframe to make this change. The patch does affect performance --
both positively and negatively; most are in the noise but not all. And
there still are some SPEC benchmarks that fail to build with the
patch, at least in Mike's tests. And Mike is implementing some patches
to utilize reload to improve use of VSX registers, which would need to
be mirrored in LRA for the equivalent functionality.
Thanks for informing me, David.

I am ready to work on any LRA ppc issues once it is in the
trunk.  It would be easier for me to work on LRA ppc if the patch is
committed to the trunk and, of course, LRA is used as the non-default
local RA.

I don't know what Mike is doing on reload to use VSX registers.  I
guess it is the use of VSX regs as spill locations for GENERAL regs
instead of memory.  If so, it is 2 days of work to add this
functionality to LRA (it already has analogous functionality for
Intel processors, which gave a nice SPECFP2000 improvement for
them), and probably more work on resolving issues, especially as I
have no power8.
I would say let's add -mlra, but make the default OFF for the time being.  We
can always switch the default later.
Sure, if you know of some LRA problems, it should not be on by default. Moreover, if we still have the problems when releasing gcc-4.9, I think we should exclude any possibility for a user to use LRA for ppc. I don't want to have GCC-4.9 users blaming LRA.

But adding LRA to PPC on the trunk (switched OFF by default) earlier could help me a lot to work on the issues.
Vladimir, I thought I included you in the list when I gave status.  The big
thing is several of the Spec 2006 benchmarks don't work in 32-bit mode, and I
get a lot of Fortran errors, again in 32-bit.  I also saw some decimal floating
point problems.
No, I did not see the message (or maybe I missed it).  I need to check.
What I'm doing is adding secondary reload support so that up until reload time,
we can represent VSX addresses as reg+offset, and in secondary reload, create
the addition instructions to put the offset in a base register.  I haven't made
any changes to the machine independent portions of the compiler.  As long as
IRA uses the secondary reload interface, it should be ok.  However, right now,
I need to focus most of my attention on getting the secondary reload support to
work.
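
The address rewrite described above can be illustrated with a toy model. This is only a sketch, not GCC code: the `Address` type, the `legitimize_for_vsx` helper, the register names, and the pseudo-assembly string are all hypothetical, and the assumption that the access accepts only a plain base register is a simplification of the real addressing constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Toy address: base register plus constant offset (hypothetical type)."""
    base: str
    offset: int  # 0 means plain register-indirect


def legitimize_for_vsx(addr, scratch):
    """Return (extra_insns, new_address).

    Assumes the access cannot take reg+offset, so a non-zero offset is
    folded into a scratch base register by one extra add instruction.
    """
    if addr.offset == 0:
        # Already register-indirect: nothing to emit.
        return [], addr
    # Pseudo-assembly for illustration only.
    add_insn = f"addi {scratch}, {addr.base}, {addr.offset}"
    return [add_insn], Address(base=scratch, offset=0)
```

In this model the reg+offset form survives until the helper is called, which mirrors keeping the address in that form up until reload time and only then materializing the addition.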
I completely understand. You are quite busy at this time, as I am, rushing some stuff into gcc-4.9.
One thing that I've asked for before, but to remind you, is I really, really
wish secondary reload could allocate two scratch registers if it is given an
insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, since
that gives 2 registers, but future changes will want TFmode to go into a single
vector register, and I will need to create another type, like V4DI that does
take 2 registers.  The case that this is needed for is moving an item from GPRs
to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
move into, and then I will do the combine operation.
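
A toy model of why two scratch registers are wanted for this case, assuming 64-bit GPRs and a 128-bit destination. The function and variable names are hypothetical and the arithmetic only mimics the data flow; it does not correspond to actual rs6000 instructions.

```python
MASK64 = (1 << 64) - 1


def move_gpr_pair_to_vector(hi_gpr, lo_gpr):
    """Model moving a 128-bit item held in two 64-bit GPRs into one
    128-bit register: each half is first moved into its own scratch,
    then the two scratches are combined."""
    scratch_hi = hi_gpr & MASK64  # first scratch receives the upper half
    scratch_lo = lo_gpr & MASK64  # second scratch receives the lower half
    # Combine step: both scratches must be live at the same time here.
    return (scratch_hi << 64) | scratch_lo
```

Both scratch values are needed simultaneously at the combine step, which is the reason a single scratch is not enough in this model.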

OK. I guess LRA can be adapted to a new secondary_reload hook that returns two scratch registers.
