Analysis of optimization PRs

Mark Mitchell Sun, 30 Oct 2005 23:14:15 -0800

There are quite a few missed-optimization PRs on the serious-regression
list: 25 to be exact.  Clearly, performance is important; the
justification for all of the tree-ssa infrastructure is better
performance.  The point isn't just to have more passes; it's to have better
SPEC numbers, better CSiBe numbers, faster compilers, smaller kernels,
etc.  It's painful to give up hard-won gains.  So, we need to take
optimization regressions seriously.


Now, that said, we all know that optimization is a game of heuristics,
and that every release will regress on some test cases.  We can add a
new pass, which improves a set of benchmarks, but happens to pessimize
some relatively rare case.  Hey, it happens.

But, we should investigate these PRs, and figure out what's going wrong,
and why.  If it's really the scenario above, then the audit trail should
say something like:

"The problem is that we're now splitting instructions earlier, which
exposes more scheduling opportunities, but combine is unable to combine
four instructions, so we don't reassemble the original form.  There's
nothing practical that we can do about this without giving up the 3%
SPEC improvement we got from the new splitters."

The key elements are that you've (a) figured out what is different
between this version and the one that worked better (the early
splitting), (b) explained why this can't easily be fixed
(combine's 3-insn limitation), and (c) justified not going
backwards (the 3% SPEC improvement).  Then, I can look at the PR, and
just close it out.  Or, you can.  There's no need to feel embarrassed
that our compiler isn't able to solve NP-hard problems.

But, without the analysis, we've got no way of knowing whether we've got
a shot at fixing the regressions.  "I want to say this is <some patch>"
or "This is just our register allocator being stupid" don't help.  Do
some real analysis; don't just guess.  Show other people what's wrong;
if it's the register allocator being stupid, show the .lreg dump, and
the .greg dump, and show how it dumped the register used in the inner
loop, rather than the one in the outer loop.  Imagine you had to write
an essay that proved that you understood the problem.

This isn't just for my sake when reviewing the PR list; it's for the
sake of anyone who might want to fix the bug.  Optimizer folks, please
help with this -- your time might be as well spent analyzing some of
these PRs as implementing additional stuff.

Thanks,

-- 
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
