On Wed, Apr 14, 2021 at 12:46:43PM -0500, Segher Boessenkool wrote:
> The REGNO checks work fine for pseudos as well.  But, why does it do
> this at all, instead of using match_dup?  That should be clearer.

Because with the hard regs it has different modes, so match_dup wouldn't
work.
We are talking here about:
Trying 7, 2 -> 8:
    7: r98:SI=x0:SI<<0xb&0x7f800
      REG_DEAD x0:QI
    2: r96:SI=zero_extend(x0:QI)
    8: r97:SI=r98:SI|r96:SI
      REG_DEAD r98:SI
      REG_DEAD r96:SI
Failed to match this instruction:
(set (reg:SI 97)
    (ior:SI (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
                (const_int 11 [0xb]))
            (const_int 522240 [0x7f800]))
        (zero_extend:SI (reg:QI 0 x0 [ i ]))))
Failed to match this instruction:
(set (reg:SI 97)
    (ior:SI (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
                (const_int 11 [0xb]))
            (const_int 522240 [0x7f800]))
        (and:SI (reg:SI 0 x0)
            (const_int 255 [0xff]))))
Splitting with gen_split_28 (aarch64.md:4434)
Successfully matched this instruction:
(set (reg:SI 99)
    (zero_extend:SI (reg:QI 0 x0 [ i ])))
Successfully matched this instruction:
(set (reg:SI 97)
    (ior:SI (ashift:SI (reg:SI 99)
            (const_int 11 [0xb]))
        (reg:SI 99)))
match_dup means insn-recog.c calls rtx_equal_p and that returns false if the
mode is not the same.
Before combine the 3 insns are:
(insn 2 4 3 2 (set (reg/v:SI 96 [ i ])
        (zero_extend:SI (reg:QI 0 x0 [ i ]))) "pr100056.c":10:1 114 {*zero_extendqisi2_aarch64}
     (expr_list:REG_DEAD (reg:QI 0 x0 [ i ])
        (nil)))
(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 3 8 2 (set (reg:SI 98)
        (ashift:SI (reg/v:SI 96 [ i ])
            (const_int 11 [0xb]))) "pr100056.c":11:17 691 {*aarch64_ashl_sisd_or_int_si3}
     (nil))
(insn 8 7 13 2 (set (reg:SI 97)
        (ior:SI (reg:SI 98)
            (reg/v:SI 96 [ i ]))) "pr100056.c":11:12 488 {iorsi3}
     (expr_list:REG_DEAD (reg:SI 98)
        (expr_list:REG_DEAD (reg/v:SI 96 [ i ])
            (nil))))
and I must say I don't know if make_more_copies was meant to split
insn 2 into
(set (reg:QI pseudo) (reg:QI 0 x0))
and
(set (reg/v:SI 96) (zero_extend:SI (reg:QI pseudo)))
or not.

> The point of make_more_copies is that the hard registers from function
> arguments are not pushed down by combine into actual instructions.  This
> can be done by RA if it thinks that is a good idea, and not done if it
> thinks it is a bad idea.  Having combine usurp part of the register
> allocators role is not a good idea.
>
> There are other reasons hard regs can still end up in RTL insns in
> earlier RTL passes of course, but the other changes that went together
> with make_more_copies stop combine from doing that a lot (the function
> itself makes sure every hard reg is copied to a new pseudo, because
> combining that trivial move (from that new pseudo to the pseudo it was
> copying it to already!) can still be beneficial for other reasons, all
> strange and pretty unhappy, but important on many targets).

	Jakub
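
For illustration only, the match_dup point above can be modelled with a toy sketch. This is not GCC code: `Reg`, `toy_rtx_equal_p` and `same_regno` are hypothetical names, and the only behaviour assumed is what is stated above, namely that the equality test used for match_dup returns false when the modes differ, whereas a REGNO-only check ignores the mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    """Toy stand-in for a (reg:MODE REGNO) rtx; not a GCC data structure."""
    mode: str   # machine mode name, e.g. "SI" or "QI"
    regno: int  # register number, e.g. 0 for x0


def toy_rtx_equal_p(a: Reg, b: Reg) -> bool:
    # Mode-sensitive comparison, mimicking the behaviour described above:
    # operands in different modes never compare equal.
    return a.mode == b.mode and a.regno == b.regno


def same_regno(a: Reg, b: Reg) -> bool:
    # REGNO-only comparison: the mode is deliberately ignored.
    return a.regno == b.regno


# The two uses of x0 in the failed combination: (reg:SI 0 x0) and (reg:QI 0 x0).
x0_si = Reg("SI", 0)
x0_qi = Reg("QI", 0)

# After the split, both uses are the same pseudo in the same mode: (reg:SI 99).
r99_a = Reg("SI", 99)
r99_b = Reg("SI", 99)
```

In this toy model `toy_rtx_equal_p(x0_si, x0_qi)` is false while `same_regno(x0_si, x0_qi)` is true, which is the hard reg case from the dump; for the two `(reg:SI 99)` operands both checks agree.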