https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69577
--- Comment #4 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> This looks like a RA issue or backend bug, perhaps the r215450 change needs
> to be narrowed down?
>
> In *.ira we have:
> (insn 3 2 4 2 (set (reg/v:V2TI 102 [ y ])
>         (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])) pr69577.c:5 1225
> {*movv2ti_internal}
>      (expr_list:REG_EQUIV (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])
>         (nil)))
> ...
> (insn 13 12 15 2 (set (subreg:TI (reg/v:V2TI 102 [ y ]) 0)
>         (reg:TI 106)) pr69577.c:6 84 {*movti_internal}
>      (expr_list:REG_DEAD (reg:TI 106)
>         (nil)))
> ...
> (insn 16 15 17 2 (set (reg:OI 108 [ y.0_11 ])
>         (subreg:OI (reg/v:V2TI 102 [ y ]) 0)) pr69577.c:7 83
> {*movoi_internal_avx}
>      (expr_list:REG_DEAD (reg/v:V2TI 102 [ y ])
>         (nil)))
> and in *.reload this is:
> (insn 3 2 4 2 (set (reg/v:V2TI 21 xmm0 [orig:102 y ] [102])
>         (mem/c:V2TI (plus:DI (reg/f:DI 7 sp)
>                 (const_int 96 [0x60])) [1 y+0 S32 A256])) pr69577.c:5 1225
> {*movv2ti_internal}
>      (expr_list:REG_EQUIV (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])
>         (nil)))
> ...
> (insn 13 12 57 2 (set (mem/c:TI (reg/f:DI 7 sp) [2 %sfp+-64 S16 A256])
>         (reg:TI 0 ax [106])) pr69577.c:6 84 {*movti_internal}
>      (nil))
> (insn 57 13 15 2 (set (reg:TI 21 xmm0 [orig:102 y ] [102])
>         (mem/c:TI (reg/f:DI 7 sp) [2 %sfp+-64 S16 A256])) pr69577.c:6 84
> {*movti_internal}
>      (nil))
> ...
> (insn 16 58 17 2 (set (mem/c:OI (plus:DI (reg/f:DI 7 sp)
>                 (const_int 32 [0x20])) [2 %sfp+-32 S32 A256])
>         (reg:OI 21 xmm0 [orig:102 y ] [102])) pr69577.c:7 83
> {*movoi_internal_avx}
>      (nil))
>
> insn 13 is supposed to change just the low 128-bits of the 256-bit register
> and preserve the high 128 bits, but after reload the instruction doesn't do
> that anymore, it clears the high 128 bits instead.
> So, either IRA or LRA should have prevented assigning pseudo 102 to ymm0
> register because of this, or something should have transformed that into a
> vector insertion instruction instead. Richard, can you please have a look?

I believe the machine-dependent code is more responsible for this. I remember
some discussion of it. LRA follows RTL semantics, where an insn
(set (subreg reg 0) ...) sets only part of the register, but the x86-64 insn
implementing it changes the whole register. It should be an x86-64 insn that
changes only part of the register. On the other hand, when no subreg is used,
the currently generated insn should be used.

I don't know how best to fix this in the x86 machine-dependent code. If
necessary, we could introduce a hook in LRA to help deal with the two
different situations.
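
To make the mismatch discussed above concrete, here is a minimal sketch in plain C. It is only a toy model, not GCC or LRA code: the types `reg256` and `val128` and the functions `rtl_subreg_set_low`, `vex_move128` and `same_reg` are hypothetical names invented for illustration. It assumes the 128-bit move is emitted in its VEX-encoded form, which zeroes bits 255:128 of the destination register.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (hypothetical, for illustration only) of a 256-bit vector
   register: lanes[0..1] are the low 128 bits, lanes[2..3] the high 128. */
typedef struct { uint64_t lanes[4]; } reg256;

/* Toy model of a 128-bit (TImode-sized) value. */
typedef struct { uint64_t lanes[2]; } val128;

/* What RTL means by (set (subreg:TI (reg:V2TI r) 0) src):
   the low 128 bits are replaced, the high 128 bits are preserved. */
reg256 rtl_subreg_set_low(reg256 r, val128 src)
{
    memcpy(&r.lanes[0], src.lanes, sizeof src.lanes);
    return r;
}

/* What a VEX-encoded 128-bit move into the same register does:
   the low 128 bits are written and the high 128 bits are zeroed. */
reg256 vex_move128(reg256 r, val128 src)
{
    memset(r.lanes, 0, sizeof r.lanes);
    memcpy(&r.lanes[0], src.lanes, sizeof src.lanes);
    return r;
}

/* Returns nonzero when both modeled registers hold the same 256 bits. */
int same_reg(reg256 a, reg256 b)
{
    return memcmp(a.lanes, b.lanes, sizeof a.lanes) == 0;
}
```

In this model the two functions agree only when the high half of the register is already zero. With a live high half, as for pseudo 102 between insn 3 and insn 16, they differ, which corresponds to the wrong code described in comment #3.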