https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64785

--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Kazumoto Kojima from comment #6)
> 
> I like your pre-RA pass even if it's too big a hammer for
> this specific problem.  It should wait for the next stage1, though.

It seems that this PR's issue is not a frequent use case (no hits in CSiBE at
all).  So yes, stage1 is good.


> Also it would be better to look for other use cases of that
> pass as you suggested so as to justify the cost of scanning
> all insns.

Some use cases for the pre-RA pass:
- R0 pre-allocation

- reduction of the number of pseudos and reg-reg copies
  some passes leave behind pseudos and copies which can be removed
  to make the RA task easier.

- 2 operand / commutative operands optimization
  on SH the dest operand is always one of the source operands.
  I've seen several times that the generic RA makes not-so-good
  choices which result in more live regs and unnecessary reg-reg
  copies.  Very often output operands are put in different pseudos
  than the input operands before RA and RA has to fix this somehow.

- the last time I played with the fipr insn (PR 55295) RA had trouble
  allocating FV regs.  For example:
     void func (float a, float b, float c, float d)
  would not allocate (a,b,c,d) to FV4, although the operands are already
  in the appropriate FR regs.  It resulted in many unnecessary reg-reg
  copies.  I haven't tried this with LRA though.
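
To illustrate the 2 operand / commutative operands item above, here is a
minimal toy sketch, not GCC code.  The struct op3 and copies_needed names are
made up for this example, registers are plain ints, and liveness is ignored
(sources are assumed to stay live).  It only counts how many reg-reg copies a
three-address operation needs once it is forced into the two-operand form
"dst OP= src":

```c
#include <stdbool.h>

/* Hypothetical three-address operation "dst = src1 OP src2".
   Registers are modelled as plain integers.  */
struct op3 {
    int dst, src1, src2;
    bool commutative;
};

/* Number of reg-reg copies needed to express OP in two-operand form,
   where the destination is also the first source.  Assumption: liveness
   is ignored, so no source register may be clobbered.  */
int copies_needed(struct op3 op)
{
    if (op.dst == op.src1)
        return 0;   /* already "dst OP= src2" */
    if (op.dst == op.src2)
        /* Commutative: swap the sources, no copy needed.
           Otherwise: tmp = src1; tmp OP= src2; dst = tmp.  */
        return op.commutative ? 0 : 2;
    return 1;       /* dst = src1; dst OP= src2 */
}
```

In this model, picking the same register for the output and one of the
inputs before RA is what makes the copies disappear, and for commutative
operations either input will do.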


There are some more things which I'd do before RA:

- Forming SH2A movu.{b|w} insns (PR 64792)
- Various constant optimizations (PR 63390, PR 51708, PR 54682, PR 65069)
- 64 bit FP load/store fusion (PR 64305)

It would be possible to write one huge pre-RA RTL pass to do all of that. 
However, I'd like to avoid accidents such as reload.c and rather keep things
separated as much as possible.  I don't have evidence, but I don't think that
scanning all insns is too bad.  It's already being done multiple times during
compilation and there are other places which could be optimized.  For example,
as far as I know, the split4 and split5 passes are not needed on SH and could
be disabled.  Or maybe the conditions in define_split such as
"can_create_pseudo_p ()" should be evaluated *before* all insns are
scanned/recog'ed.
