https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69577

--- Comment #4 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> This looks like a RA issue or backend bug, perhaps the r215450 change needs
> to be narrowed down?
> 
> In *.ira we have:
> (insn 3 2 4 2 (set (reg/v:V2TI 102 [ y ])
>         (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])) pr69577.c:5 1225
> {*movv2ti_internal}
>      (expr_list:REG_EQUIV (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])
>         (nil)))
> ...
> (insn 13 12 15 2 (set (subreg:TI (reg/v:V2TI 102 [ y ]) 0)
>         (reg:TI 106)) pr69577.c:6 84 {*movti_internal}
>      (expr_list:REG_DEAD (reg:TI 106)
>         (nil)))
> ...
> (insn 16 15 17 2 (set (reg:OI 108 [ y.0_11 ])
>         (subreg:OI (reg/v:V2TI 102 [ y ]) 0)) pr69577.c:7 83
> {*movoi_internal_avx}
>      (expr_list:REG_DEAD (reg/v:V2TI 102 [ y ])
>         (nil)))
> and in *.reload this is:
> (insn 3 2 4 2 (set (reg/v:V2TI 21 xmm0 [orig:102 y ] [102])
>         (mem/c:V2TI (plus:DI (reg/f:DI 7 sp)
>                 (const_int 96 [0x60])) [1 y+0 S32 A256])) pr69577.c:5 1225
> {*movv2ti_internal}
>      (expr_list:REG_EQUIV (mem/c:V2TI (reg/f:DI 16 argp) [1 y+0 S32 A256])
>         (nil)))
> ...
> (insn 13 12 57 2 (set (mem/c:TI (reg/f:DI 7 sp) [2 %sfp+-64 S16 A256])
>         (reg:TI 0 ax [106])) pr69577.c:6 84 {*movti_internal}
>      (nil))
> (insn 57 13 15 2 (set (reg:TI 21 xmm0 [orig:102 y ] [102])
>         (mem/c:TI (reg/f:DI 7 sp) [2 %sfp+-64 S16 A256])) pr69577.c:6 84
> {*movti_internal}
>      (nil))
> ...
> (insn 16 58 17 2 (set (mem/c:OI (plus:DI (reg/f:DI 7 sp)
>                 (const_int 32 [0x20])) [2 %sfp+-32 S32 A256])
>         (reg:OI 21 xmm0 [orig:102 y ] [102])) pr69577.c:7 83
> {*movoi_internal_avx}
>      (nil))
> 
> insn 13 is supposed to change just the low 128-bits of the 256-bit register
> and preserve the high 128 bits, but after reload the instruction doesn't do
> that anymore, it clears the high 128 bits instead.  So, either IRA or LRA
> should have prevented assigning pseudo 102 to ymm0 register because of this,
> or something should have transformed that into a vector insertion
> instruction instead.  Richard, can you please have a look?

I believe the machine-dependent code is more responsible for this.  I remember
some discussion of it.  LRA follows the RTL semantics, under which an insn like

(set (subreg reg 0) ...)

sets only the part of reg selected by the subreg, but the x86-64 insn
implementing it changes the whole reg.

The subreg case should be implemented by an x86-64 insn that changes only part
of the register.  On the other hand, when no subreg is used, the currently
generated insn should be used.  I don't know how best to fix this in the x86
machine-dependent code.
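
To make the mismatch concrete, here is a minimal C sketch of the two behaviours.
It is only a model, not GCC code: the reg256 type and the function names are
made up for illustration, and it assumes the 128-bit move is emitted in its
VEX-encoded form, which zeroes the upper 128 bits of the destination register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit register as four 64-bit lanes:
   lanes 0-1 are the low 128 bits, lanes 2-3 the high 128 bits.  */
typedef struct { uint64_t lane[4]; } reg256;

/* RTL semantics of (set (subreg:TI (reg:V2TI r) 0) src):
   only the low 128 bits change, the high 128 bits are preserved.  */
static void
set_low_subreg (reg256 *r, const uint64_t src[2])
{
  memcpy (r->lane, src, 2 * sizeof (uint64_t));
}

/* Assumed behaviour of a VEX-encoded 128-bit move into the same
   register: the low 128 bits are written, the high 128 bits zeroed.  */
static void
vex_move_128 (reg256 *r, const uint64_t src[2])
{
  memcpy (r->lane, src, 2 * sizeof (uint64_t));
  r->lane[2] = 0;
  r->lane[3] = 0;
}

/* Apply both models to the same starting value and report whether
   they disagree on the high 128 bits (1 = disagree, 0 = agree).  */
static int
models_disagree (void)
{
  const uint64_t src[2] = { 0x11, 0x22 };
  reg256 a = { { 1, 2, 3, 4 } };
  reg256 b = a;

  set_low_subreg (&a, src);
  vex_move_128 (&b, src);

  /* Both models agree on the low half.  */
  assert (a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1]);
  return a.lane[2] != b.lane[2] || a.lane[3] != b.lane[3];
}
```

In this model the two functions agree only when the high half is already zero,
which is why the same insn is fine for a plain TImode move but wrong for the
subreg store.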

If necessary, we could introduce a hook in LRA to help deal with these two
different situations.
