https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480

--- Comment #11 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
My finding is that RA is not a problem for GCC speed with -O1 and up.

RA at -O0 really does consume a big portion of GCC compile time.  The
biggest part of RA at -O0 is actually spent in life analysis.  It is
difficult to implement a modest RA w/o life analysis, as it would
result in huge stack slot generation (not knowing pseudo live ranges
basically means allocating a stack slot for each pseudo).

The problem with the test is a huge number of pseudos (or IRA
objects).  This results in a big sparse set (which can hardly
fit in the L3 cache) and bad cache behaviour.
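
For readers unfamiliar with the data structure, here is a minimal sketch
of a sparse set in the Briggs-Torczon style.  It is an illustration
only, not GCC's actual implementation: the names (sparse_set, ss_add,
ss_bytes and so on) are hypothetical, and the sparse array is
zero-initialized for simplicity.  The point it shows is that the set
needs two full-size integer arrays indexed by element number, so with a
huge universe of objects each membership test can touch two distant
memory locations.

```c
#include <assert.h>
#include <stdlib.h>

/* Sparse set over the universe [0, universe).  Hypothetical names. */
struct sparse_set {
    unsigned *dense;    /* members, packed into dense[0 .. members-1] */
    unsigned *sparse;   /* sparse[e] = position of e in dense, if present */
    unsigned members;   /* current number of members */
    unsigned universe;  /* exclusive upper bound on element values */
};

/* Bytes taken by the two arrays: two integers per possible element. */
static size_t ss_bytes(unsigned universe) {
    return 2 * (size_t)universe * sizeof(unsigned);
}

static struct sparse_set *ss_alloc(unsigned universe) {
    assert(universe > 0);
    struct sparse_set *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->dense = malloc(universe * sizeof *s->dense);
    /* calloc keeps reads of sparse[] well defined in this sketch. */
    s->sparse = calloc(universe, sizeof *s->sparse);
    if (!s->dense || !s->sparse) {
        free(s->dense);
        free(s->sparse);
        free(s);
        return NULL;
    }
    s->members = 0;
    s->universe = universe;
    return s;
}

static void ss_free(struct sparse_set *s) {
    free(s->dense);
    free(s->sparse);
    free(s);
}

/* Membership test: one read of sparse[], then one read of dense[]. */
static int ss_contains(const struct sparse_set *s, unsigned e) {
    assert(e < s->universe);
    unsigned idx = s->sparse[e];
    return idx < s->members && s->dense[idx] == e;
}

static void ss_add(struct sparse_set *s, unsigned e) {
    if (ss_contains(s, e)) return;
    s->sparse[e] = s->members;
    s->dense[s->members++] = e;
}

/* Remove by moving the last member into the vacated dense slot. */
static void ss_remove(struct sparse_set *s, unsigned e) {
    if (!ss_contains(s, e)) return;
    unsigned idx = s->sparse[e];
    unsigned last = s->dense[--s->members];
    s->dense[idx] = last;
    s->sparse[last] = idx;
}

/* Clearing is O(1): stale entries fail the check in ss_contains. */
static void ss_clear(struct sparse_set *s) {
    s->members = 0;
}
```

Under these assumptions the set costs 2 * sizeof(unsigned) bytes per
possible element, against one bit per element for a flat bit vector.
That is the trade-off behind the cache behaviour described above:
constant-time clear and cheap iteration over members, paid for with a
much larger and more scattered memory footprint.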

I tried to use a bitmap instead of a sparse set, but GCC crashed after
allocating 48GB of memory.  Sbitmap works better and improves IRA time by
12%.  But it works worse for other, more frequent cases.

So I don't think that RA behaviour can be improved for this case.
