https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123214

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
             Target|                            |aarch64-*-* x86_64-*-*
   Last reconfirmed|                            |2025-12-19
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is the usual register passing vs. memory RTL expansion for a TImode
structure and the inability to CSE/DSE on RTL.

postreload sees for example

(insn 27 7 4 2 (set (reg:TI 0 ax [orig:110 c ] [110])
        (reg:TI 1 dx [ c ])) "t.C":3:49 98 {*movti_internal}
     (expr_list:REG_EQUIV (mem/c:TI (plus:DI (reg/f:DI 19 frame)
                (const_int -16 [0xfffffffffffffff0])) [12 c+0 S16 A64])
        (nil)))
(note 4 27 5 2 NOTE_INSN_DELETED)
(insn 5 4 6 2 (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
                (const_int -24 [0xffffffffffffffe8])) [12 c+0 S16 A64])
        (reg:TI 0 ax [orig:110 c ] [110])) "t.C":3:49 98 {*movti_internal}
     (nil))
(note 6 5 9 2 NOTE_INSN_FUNCTION_BEG)
(insn 9 6 10 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (mem/c:QI (plus:DI (reg/f:DI 7 sp)
                    (const_int -16 [0xfffffffffffffff0])) [4 MEM[(const struct _Optional_base *)&c]._M_payload.D.14876._M_engaged+0 S1 A64])
            (const_int 0 [0]))) "t.C":4:5 discrim 1 9 {*cmpqi_ccno_1}
     (nil))

and only split2 splits the TImode operation to

(insn 29 16 30 4 (set (reg:DI 1 dx)
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int -24 [0xffffffffffffffe8])) [12 c+0 S8 A64])) "t.C":8:11 99 {*movdi_internal}
     (nil))
(insn 30 29 21 4 (set (reg:DI 2 cx [+8 ])
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int -16 [0xfffffffffffffff0])) [12 c+8 S8 A64])) "t.C":8:11 99 {*movdi_internal}
     (nil))

which, had it happened earlier, would likely have allowed CSE.  I'd have
expected STV2, when not converting TImode ops, to apply such splitting
rather than leaving it until after postreload CSE.  On aarch64 there's
load/store multiple to consider and I've not looked at the RTL we have
there.  This might in fact be (tuned) target dependent behavior after all.
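
For illustration only, a minimal sketch of the kind of source that leads to
this pattern.  This is not the testcase from the PR (which is not quoted
here); the function name value_or_zero and the choice of
std::optional<std::int64_t> are assumptions.  The point is just that a
trivially copyable 16-byte aggregate is passed in a register pair, and reading
one byte of it (the _M_engaged flag) typically makes expansion store the whole
TImode value to the stack and load the pieces back.

```cpp
#include <cstdint>
#include <optional>
#include <type_traits>

// Hypothetical stand-in for the PR's testcase, not the original source.
// With libstdc++ on x86_64 and aarch64 this type is typically 16 bytes
// (8-byte payload plus the _M_engaged flag, padded) and is passed by value
// in two general purpose registers.
using opt_i64 = std::optional<std::int64_t>;

static_assert(std::is_trivially_copyable_v<opt_i64>,
              "needs to be trivially copyable to be passed in registers");

std::int64_t value_or_zero(opt_i64 c)
{
    if (c)          // byte-sized read of the engaged flag
        return *c;  // 8-byte read of the payload
    return 0;
}
```

Compiling something like this with -O2 -fdump-rtl-postreload -fdump-rtl-split2
should show whether the TImode store and the later narrower loads are still
present on a given target and tuning; the exact RTL will differ from the dump
above.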
