On 12-10-01 4:24 PM, Steven Bosscher wrote:
> On Mon, Oct 1, 2012 at 9:51 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
>>> I think it's more important in this case to recognize Steven's real
>>> point, which is that for an identical situation (IRA), and with an
>>> identical patch author, we had similar bugs.  They were promised to be
>>> worked on, and yet some of those regressions are still very much with
>>> us.
>> That is not true.  I worked on many compile-time regression bugs.  I
>> remember one serious degradation of compilation time on
>> all_cp2k_gfortran.f90.  I solved the problem and made IRA work faster
>> and generate much better code than the old RA.
>>
>> http://blog.gmane.org/gmane.comp.gcc.patches/month=20080501/page=15
>>
>> About the other two PRs Steven mentioned:
>>
>> PR26854.  I worked on this bug even when IRA was still on its branch,
>> and again made GCC with IRA 5% faster on this test than GCC with the
>> old RA.
>>
>> PR 54146 is 3 months old.  There was a lot of work on other
>> optimizations before IRA became important for it; that happened only
>> 2 months ago.  I have had no time to work on it yet, but I am going to.
> This is also not quite true, see PR37448, which shows the same
> problems as the test case for PR54146.
I worked on this PR too and made some progress, but much less than on the other PRs.

Sometimes I think that I should have maintained the old RA in order to compare against it and show that IRA is still a step ahead.  Unfortunately, comparison with old releases makes no sense now because of more aggressive optimizations (e.g. inlining).
> I just think scalability is a very important issue. If some pass or
> algorithm scales badly on some measure, then users _will_ run into that
> at some point and report bugs about it (if you're lucky enough to have
> a user patient enough to sit out the long compile time :-) ). Also,
> good scalability opens up opportunities. For example, historically GCC
> has been conservative on inlining heuristics to avoid compile time
> explosions. I think it's better to address the causes of that
> explosion and to avoid introducing new potential bottlenecks.
As I wrote, scalability is sometimes misleading, as in the case of the PR about a Scheme interpreter, where 30% more compilation time with LRA translated into a 15% decrease in code size and most probably better performance.  We could achieve scalability at the cost of worse performance, but that is not what the user expects, and even worse, the user would not know about it.  That is an even bigger problem, IMO.

Ideally, we should have more scalable algorithms as a fallback, but we should warn the programmer that worse-performing algorithms are being used and that better performance could be achieved by dividing a huge function into several smaller ones.
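
To make that concrete, here is a minimal sketch of the fallback idea; this is not actual GCC code, and the function names, the threshold, and the diagnostic wording are all hypothetical:

/* Minimal sketch of a fallback strategy choice; all names and the
   threshold here are hypothetical, not actual GCC code.  */

#include <stdio.h>

enum ra_strategy { RA_FULL, RA_FAST_FALLBACK };

/* Hypothetical cutoff; a real one would be tuned empirically,
   e.g. on the number of pseudos or the conflict-graph size.  */
#define RA_MAX_BLOCKS_FOR_FULL_RA 50000

static enum ra_strategy
choose_ra_strategy (int n_basic_blocks, const char *func_name)
{
  if (n_basic_blocks <= RA_MAX_BLOCKS_FOR_FULL_RA)
    return RA_FULL;

  /* Warn so the user knows code quality may suffer and that splitting
     the function would bring the better allocator back.  */
  fprintf (stderr,
           "note: function '%s' is too big for the full register "
           "allocator; using a faster, lower-quality fallback.  "
           "Consider splitting it into smaller functions.\n",
           func_name);
  return RA_FAST_FALLBACK;
}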

>> People sometimes see that RA takes a lot of compilation time, but that
>> is in the nature of RA.  I'd recommend first checking how the old RA
>> behaves before calling it a degradation.
> There's no question that RA is one of the hardest problems the
> compiler has to solve, being NP-complete and all that. I like LRA's
> iterative approach, but if you know you're going to solve a hard
> problem with a number of potentially expensive iterations, there's even
> more reason to make scalability a design goal!
When I designed IRA I kept this goal in mind too, although my first priority was performance.  The old RA was OK when register pressure was not high, but I knew aggressive inlining and LTO were coming, and my goal was to design an RA that generates good code for bigger programs with much higher register pressure, where the old RA's drawbacks would be obvious.
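
For readers following along, the iterate-and-spill shape Steven refers to might look roughly like this.  It is only a toy sketch with hypothetical helpers, not LRA's real interface, but it shows where an explicit iteration bound can guard compile time:

/* Toy sketch of an iterative allocator; assign_hard_regs and
   spill_and_split are hypothetical helpers, not LRA's interface.  */

#include <stdbool.h>

extern bool assign_hard_regs (void);  /* try to satisfy all constraints */
extern void spill_and_split (void);   /* repair failures; may add pseudos */

/* Each pass can be expensive, so bound the number of passes to keep
   worst-case compile time under control.  */
#define MAX_RA_PASSES 30

static void
iterative_allocate (void)
{
  for (int pass = 0; pass < MAX_RA_PASSES; pass++)
    {
      if (assign_hard_regs ())
        return;               /* all constraints satisfied */
      spill_and_split ();     /* the problem changed; iterate again */
    }
  /* Give up iterating and accept a conservative allocation rather
     than risk unbounded compile time.  */
}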
> As I said earlier in this thread, I was really looking forward to IRA
> at the time you worked on it, because it is supposed to be a regional
> allocator and I had expected that to mean it could, well, allocate
> per-region, which is usually very helpful for scalability (partition
> your function and insert compensation code on strategically picked
> region boundaries). But that's not what IRA has turned out to be.
> (Instead, its regional nature is one of the reasons for its
> scalability problems.)  IRA is certainly better than old global.c
> in very many ways, and LRA looks like a well thought-through and
> welcome replacement for old reload. But scalability is an issue in the
> design of IRA, and LRA looks to be the same in that regard.
"Regional allocator" means that allocation decisions are made taking mostly one region into account, in order to achieve a better allocation within it.  But a good regional RA (e.g. Callahan-Koblenz or Intel's fusion-based RA) takes interactions with other regions into account too, and such allocators can be even slower than non-regional ones.  Although I should say that, in my experience, the CK allocator (I tried and implemented it about 6 years ago) probably scales better than IRA, but it generates worse code.
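
To illustrate the partition-and-compensate scheme Steven describes, and the cross-region fix-ups it requires, here is a toy sketch; every type and helper in it is hypothetical:

/* Toy sketch of per-region allocation with compensation code on
   region boundaries.  All types and helpers are hypothetical.  */

struct region;                          /* blocks, live pseudos, ...  */
struct boundary { struct region *src, *dst; };

extern int n_regions, n_boundaries, max_pseudo;
extern struct region *regions[];
extern struct boundary boundaries[];

/* Allocate one region independently; each subproblem stays small,
   which is what helps scalability.  */
extern void allocate_region (struct region *);

/* Location (hard reg or stack slot) of PSEUDO at the exit or entry
   of region R; hypothetical queries.  */
extern int location_at_exit (struct region *r, int pseudo);
extern int location_at_entry (struct region *r, int pseudo);

/* Emit a move, load, or store on the boundary edge.  */
extern void emit_compensation_move (struct boundary *b, int pseudo,
                                    int from, int to);

static void
regional_allocate (void)
{
  for (int i = 0; i < n_regions; i++)
    allocate_region (regions[i]);

  /* Where adjacent regions disagree about a pseudo's location,
     insert compensation code on the boundary instead of re-solving
     a global allocation problem.  */
  for (int i = 0; i < n_boundaries; i++)
    for (int p = 0; p < max_pseudo; p++)
      {
        int from = location_at_exit (boundaries[i].src, p);
        int to = location_at_entry (boundaries[i].dst, p);
        if (from != to)
          emit_compensation_move (&boundaries[i], p, from, to);
      }
}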
