On Mon, May 25, 2015 at 10:26:35AM +0930, Alan Modra wrote: > On Sat, May 23, 2015 at 08:45:23AM -0500, Segher Boessenkool wrote: > > Thanks. Did you see improvements around return as well, or mostly / > > only related to the function start? > > The rlwinm vs. rldicl change was in dwarf2out.c add_ranges_num for r3 > one insn before a blr. I'm sure that was due to not combining return > value copies. In fact, I'm sure all the improvements I saw were due > to changing the exit.. See below. > > > > +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the first block BB > > > + that we don't want to combine with other instructions. */ > > > + > > > +static void > > > +twiddle_first_block (basic_block bb, basic_block to) > > > > I don't like this much. Messing with global state makes things harder > > to change later. If it is hard to come up with a good name for a > > factor, it usually means it is not such a good factor. > > > > You can do these inside can_combine_{def,use}_p as far as I can see? > > Probably need to give those an extra parameter then: for _def, a bool > > that says "don't allow moves from hard regs", set when the scan has > > encountered a NOTE_INSN_FUNCTION_BEG in this bb; and for _use, a regset > > of those hard regs we don't want to allow moves to (those seen in USEs > > later in that same block). Or do it in the main loop itself, _{def,use} > > are each called in one place only; whatever works best. > > Huh, that's the way I started writing this patch.. The reason I went > with modifying BLOCK_FOR_INSN is that the loop setting up LOG_LINKS > needs to test that anyway. Changing code in the main loop means > every insn in a function will need to be tested for additional > conditions. I thought that might be slower. I can see your point > though, especially if someone wanted to wean combine off LOG_LINKS.
The setup of the LOG_LINKS is a simple fast linear loop, much less work than the rest of combine. So don't worry about performance too much :-) > > What makes return values different from CALL args here, btw? Why do > > you not want to combine return values, but you leave alone call args? > > I don't think there is much difference between SETs for return values > and SETs for call args. The reason I deliberately left them out is > that in the original discussion we were talking about parameters and > return values. So I thought I'd better restrict the change to just > those SETs. > > It would be a much simpler patch to make combine ignore all SETs > that are a reg,reg copy with one of them a hard reg. I was a little > worried I'd regress some target if I tried that. _All_ moves to/from hard regs includes much more (register asm, fixed registers, maybe some targets do weird things as well?). So yes I share those worries. Since we do not want any other pass (before RA) to foul up these register moves either, maybe a better solution would be to mark them some way at expand time? > (Results on > powerpc64le-linux for such a change look good. A lot more cases where > code is better, due to catching the parameter reg,reg copies. In > looking at gcc/*.o I haven't yet seen any regressions in code quality.) That is good news :-) > > > +/* Fill in log links field for all insns that we wish to combine. */ > > > > Please don't change this comment; it was more correct before. > > But it wasn't correct! LOG_LINKS are not set up for *all* insns. It "sets" all LOG_LINKS to some value ("sets", it doesn't actually zero them at the start, it has an assert for that though). > > > @@ -1103,7 +1192,7 @@ create_log_links (void) > > > we might wind up changing the semantics of the insn, > > > even if reload can make what appear to be valid > > > assignments later. */ > > > - if (regno < FIRST_PSEUDO_REGISTER > > > + if (HARD_REGISTER_NUM_P (regno) > > > && asm_noperands (PATTERN (use_insn)) >= 0) > > > continue; > > > > An independent change. > > Yeah, I guess. I tidy these if I'm working on a piece of code. Pretty far away in this case ;-) > Here's the whole file done. Boostrapped and regression tested > powerpc64le-linux and x86_64-linux. > > * combine.c: Use HARD_REGISTER_P and HARD_REGISTER_NUM_P as > appropriate throughout file in place of comparing regno > against FIRST_PSEUDO_REGISTER. Have we decided we want to convert the whole compiler to this? It is a pretty lousy interface IMO: HARD_REGISTER_P does not check if its arg is a register; it does not check if its arg is a hard register either (it checks if it is not a pseudo); it cannot be used in all places in all passes because of that (which means, not in all macros and hooks either). Segher