https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Before reload, we have this sequence:

--cut here--
(insn 34 4 2 2 (set (reg:TI 119)
        (reg:TI 20 xmm0 [ u ])) "pr110372.c":8:1 89 {*movti_internal}
     (expr_list:REG_DEAD (reg:TI 20 xmm0 [ u ])
        (nil)))
(insn 2 34 3 2 (set (reg/v:TI 98 [ u ])
        (reg:TI 119)) "pr110372.c":8:1 89 {*movti_internal}
     (expr_list:REG_DEAD (reg:TI 119)
        (nil)))
(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 3 9 2 (set (reg:V4SI 83 [ _2 ])
        (and:V4SI (subreg:V4SI (reg/v:TI 98 [ u ]) 0)
            (mem/u/c:V4SI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128]))) "pr110372.c":9:14 6840 {*andv4si3}
     (expr_list:REG_EQUAL (and:V4SI (subreg:V4SI (reg/v:TI 98 [ u ]) 0)
            (const_vector:V4SI [
                    (const_int 4 [0x4]) repeated x4
                ]))
        (nil)))
--cut here--

And reload tries to move xmm0 to xmm1 with:

(insn 36 3 41 2 (set (reg:DI 0 ax [123])
        (reg:DI 20 xmm0 [orig:98 u ] [98])) "pr110372.c":9:14 90 {*movdi_internal}
     (nil))
(insn 41 36 37 2 (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
                (const_int -40 [0xffffffffffffffd8])) [2 %sfp+-32 S16 A128])
        (reg/v:TI 20 xmm0 [orig:98 u ] [98])) "pr110372.c":9:14 89 {*movti_internal}
     (nil))
(insn 37 41 43 2 (set (reg:DI 24 xmm4 [124])
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int -32 [0xffffffffffffffe0])) [2 %sfp+-24 S8 A64])) "pr110372.c":9:14 90 {*movdi_internal}
     (nil))
(insn 43 37 38 2 (set (reg:DI 23 xmm3 [122])
        (reg:DI 0 ax [123])) "pr110372.c":9:14 90 {*movdi_internal}
     (nil))
(insn 38 43 39 2 (set (reg:V2DI 23 xmm3 [122])
        (vec_concat:V2DI (reg:DI 23 xmm3 [122])
            (reg:DI 24 xmm4 [124]))) "pr110372.c":9:14 7265 {vec_concatv2di}
     (nil))
(insn 39 38 7 2 (set (reg:V4SI 21 xmm1 [orig:83 _2 ] [83])
        (reg:V4SI 23 xmm3 [122])) "pr110372.c":9:14 1869 {movv4si_internal}
     (nil))

in order to satisfy constraints of:

(insn 7 39 9 2 (set (reg:V4SI 21 xmm1 [orig:83 _2 ] [83])
        (and:V4SI (reg:V4SI 21 xmm1 [orig:83 _2 ] [83])
            (mem/u/c:V4SI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128]))) "pr110372.c":9:14 6840 {*andv4si3}
     (expr_list:REG_EQUAL (and:V4SI (subreg:V4SI (reg/v:TI 20 xmm0 [orig:98 u ] [98]) 0)
            (const_vector:V4SI [
                    (const_int 4 [0x4]) repeated x4
                ]))
        (nil)))

Please note that alternative 19 of *movdi_internal from i386.md (?r,?v) is
correctly enabled only for the x64_sse2 ISA, so it is unavailable without SSE2.

We have:

#define VALID_SSE_REG_MODE(MODE)                                        \
  ((MODE) == V1TImode || (MODE) == TImode                               \
   || (MODE) == V4SFmode || (MODE) == V4SImode                          \
   || (MODE) == SFmode || (MODE) == SImode                              \
   || (MODE) == TFmode || (MODE) == TDmode)

So TImode and V4SImode should be tieable for XMM registers, and the register
allocator (RA) should just copy the value between XMM registers.
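
To illustrate the tieability argument, here is a minimal standalone sketch, not
GCC's actual code. The enum, mode_size() and sse_modes_tieable_p() are
hypothetical stand-ins invented for this example; only the VALID_SSE_REG_MODE
condition is taken from the macro quoted above. It assumes that two modes can
share an XMM register without a conversion when both are 16 bytes wide and both
satisfy VALID_SSE_REG_MODE.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine_mode values.  Only the modes
   named in VALID_SSE_REG_MODE, plus DImode, are modeled.  */
enum mode
{
  V1TImode, TImode, V4SFmode, V4SImode,
  SFmode, SImode, TFmode, TDmode, DImode
};

/* Same condition as the macro quoted above.  */
#define VALID_SSE_REG_MODE(MODE)                                        \
  ((MODE) == V1TImode || (MODE) == TImode                               \
   || (MODE) == V4SFmode || (MODE) == V4SImode                          \
   || (MODE) == SFmode || (MODE) == SImode                              \
   || (MODE) == TFmode || (MODE) == TDmode)

/* Size of each modeled mode in bytes (assumption: x86-64 sizes).  */
static int
mode_size (enum mode m)
{
  switch (m)
    {
    case SFmode:
    case SImode:
      return 4;
    case DImode:
      return 8;
    default:
      return 16;
    }
}

/* Simplified model of a modes-tieable check for XMM registers: both
   modes must be 16 bytes wide and valid in an SSE register.  */
static bool
sse_modes_tieable_p (enum mode mode1, enum mode mode2)
{
  return (mode_size (mode1) == 16 && mode_size (mode2) == 16
          && VALID_SSE_REG_MODE (mode1) && VALID_SSE_REG_MODE (mode2));
}
```

Under this model sse_modes_tieable_p (TImode, V4SImode) is true, which matches
the expectation that the subreg:V4SI of the TImode value in insn 7 needs only
an XMM-to-XMM copy, not the trip through ax and the stack seen after reload.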
