Hi!

On Mon, Nov 05, 2018 at 12:35:24PM +0000, Renlin Li wrote:
> >>--- a/gcc/combine.c
> >>+++ b/gcc/combine.c
> >>@@ -14998,6 +14998,8 @@ make_more_copies (void)
> >>        continue;
> >>      if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)))
> >>        continue;
> >>+     if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO 
> >>(dest)))
> >>+       continue;
> >>  
> >>      rtx new_reg = gen_reg_rtx (GET_MODE (dest));
> >>      rtx_insn *new_insn = gen_move_insn (new_reg, src);
> >>-- 1.8.3.1
> >It certainly helps the armeb test results.
> 
> Yes, I can also see it helps a lot with the regression test.
> Thanks for working on it!

I committed a variant that does this only for frame_pointer_rtx, as r265821.

> Beside the correctness issue, there are performance regression issues as 
> other people also reported.
> 
> I analysised a case, which is gcc.c-torture/execute/builtins/memcpy-chk.c
> In this case, two additional register moves and callee saves are emitted.
> 
> The problem is that, make_more_moves split a move into two. Ideally, the RA 
> could figure out and
> make the best register allocation. However, in reality, scheduler in some 
> cases will reschedule
> the instructions, and which changes the live-range of registers. And thus 
> change the interference graph
> of pseudo registers.
> 
> This will force the RA to choose a different register for it, and make the 
> move instruction not redundant,
> at least, not possible for RA to eliminate it.
> 
> For example,
> 
> set r102, r1
> 
> After combine:
> insn x: set r103, r1
> insn x+1: set r22, r103
> 
> After scheduler:
> insn x: set r103, r1
> ...
> ...
> ...
> insn x+1: set r102, r103
> 
> After IRA, r1 could be assigned to operands used in instructions in between 
> insn x and x+1.
> so r23 is conflicting with r1. LRA has to assign r23 a different hard 
> register.
> This cause one additional move, and probably one more callee save/restore.
> 
> Nothing is obviously wrong here. But...
> 
> One simple case probably not beneficial is to split hard register store.

You mean a store from a hard reg directly to memory?  Leaving that
constrains scheduling.

> According to your comment on make_more_moves, you might want to apply the 
> transformation only
> on hard-reg-to-pseudo-copy?

hard-reg-to-anything really.  Actually making it do this only for pseudos
caused a lot of degradation for some targets iirc.  I can do more tests.

Almost all reported degradations are cases where RA does not make the
best decision.  There are other cases where this combine change makes
better code than before.  (If this was not true, a greedy RA would work
well.)


Segher

Reply via email to