On 22/04/15 10:39 AM, Georg-Johann Lay wrote:
Attached is a C test program which produces fine results with
$ avr-gcc -S -O2 -mmcu=atmega8
Also attached is a respective patch against the trunk avr backend that
indicates the transition from clobbers to hard-regs-by-constraint.
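
For readers without the attachment, here is a minimal sketch of the kind of code that exercises these patterns. It is not the attached test program. The function names are made up for illustration, and it rests on the assumption that avr-gcc handles a 32-bit multiply on atmega8 through a support routine that takes its operands in fixed hard registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the attached test, which is not reproduced here. */

/* 32-bit x 32-bit multiply: the mulsi3 case mentioned in this thread. */
uint32_t mul32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* 16-bit operands sign-extended to a 32-bit product. */
int32_t mul16x16_signed(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* 16-bit operands zero-extended to a 32-bit product. */
uint32_t mul16x16_unsigned(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Sanity check of the arithmetic only; it says nothing about code size. */
void check_multiplies(void)
{
    assert(mul32(UINT32_C(70000), UINT32_C(3)) == UINT32_C(210000));
    assert(mul16x16_signed(-300, 200) == INT32_C(-60000));
    assert(mul16x16_unsigned(65535u, 65535u) == UINT32_C(4294836225));
}
```

Compiling such a file with the command above and comparing the assembly before and after the patch would be one way to see the effect on small functions.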
I don't actually remember when I tried this first; sometime around
when 4.8 was in stage 1 or so.
If my recollection is right, the problem was not that small test
programs with mulsi3 produced large code, but that "ordinary" code
could get much worse. I had the impression it was because of the bunch
of new, rarely used / rarely useful register classes, and that IRA's
cost computation got confused, or at least much less accurate, than with
the usual register classes (only 10 classes of GENERAL_REG).
The attached patch adds 27 new register classes, and to transform all
insns even more classes might be needed: 8-bit, 16-bit and 24-bit
multiplications including sign/zero extension of operands, fixed-point
functions from 8...32 bit, divmod, builtins implementations, support
functions for address spaces, ...
The insns which use this all have the following properties in
common:
- Only 1 constraint alternative
- Register allocation is uniquely determined, i.e. the reg allocator has
  no choice what register to pick for what operand (except for
  commutative constraints with '%', which give exactly 2 solutions).
The patch avoids clobbers or scratches altogether. The only insns
where a register other than the output is affected are transformed
from single_set to parallels in split1. The 2nd set describes setting
(reg:HI 26) to a useless value. The insn is not expanded as a
parallel, because insn combine won't use parallels for combinations.
Is there a chance that register allocation gets worse just because so
many register classes are added?
Thanks for providing the patch and test. Sometimes it is hard even for
me to say how the RA will behave in complicated cases without
investigating actual code. After looking at the IRA dumps, I think you
will be okay.
When you use hard regs, pseudos moved to the hard regs get a preference
for those hard regs from the moves, so the chance that the pseudos get
the hard reg and the moves are eliminated is high.
When you use constraints whose class contains a single hard reg, the
operand pseudo gets the same hard reg preference. IRA has code for this.
So the final result will be quite analogous.
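
To illustrate why the two approaches end up close to each other, here is a toy model in C. It is only a sketch under simplifying assumptions and is not IRA's actual code or data structures. The type `pseudo_costs` and every function name are invented for this example, and the "cost" is just a per-register integer that gets reduced when a preference is recorded.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_HARD_REGS = 32 };

/* Toy model: cost[r] is the estimated cost of putting a pseudo in hard reg r. */
typedef struct {
    int cost[NUM_HARD_REGS];
} pseudo_costs;

void init_costs(pseudo_costs *p)
{
    for (int r = 0; r < NUM_HARD_REGS; r++)
        p->cost[r] = 0;
}

/* Case 1: a move between the pseudo and hard_reg, executed freq times.
   Assigning hard_reg to the pseudo would make that move removable. */
void prefer_from_move(pseudo_costs *p, int hard_reg, int freq)
{
    p->cost[hard_reg] -= freq;
}

/* Case 2: the operand's constraint names a class given as a bit mask.
   A preference is recorded only if the class contains exactly one reg. */
void prefer_from_single_reg_class(pseudo_costs *p, uint32_t class_mask, int freq)
{
    if (class_mask == 0 || (class_mask & (class_mask - 1)) != 0)
        return; /* empty class or more than one register: no preference */
    int r = 0;
    while ((class_mask & 1u) == 0) {
        class_mask >>= 1;
        r++;
    }
    p->cost[r] -= freq;
}

/* Pick the cheapest hard reg; ties go to the lowest register number. */
int pick_reg(const pseudo_costs *p)
{
    int best = 0;
    for (int r = 1; r < NUM_HARD_REGS; r++)
        if (p->cost[r] < p->cost[best])
            best = r;
    return best;
}

/* Both ways of expressing "this value lives in r22" yield the same choice. */
void check_same_preference(void)
{
    pseudo_costs via_move, via_class;
    init_costs(&via_move);
    init_costs(&via_class);
    prefer_from_move(&via_move, 22, 10);
    prefer_from_single_reg_class(&via_class, UINT32_C(1) << 22, 10);
    assert(pick_reg(&via_move) == pick_reg(&via_class));
}
```

In this model the explicit move and the one-register class both lower the cost of the same hard reg by the same amount, which is the sense in which the results are analogous.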
The only difference is that the RA (more accurately, the ira-costs.c
code) will be a bit slower as there are more reg classes.