https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84278

            Bug ID: 84278
           Summary: claims initv4sfv2sf is available but inits through
                    stack
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

float A[1024];
float B[1024];

void foo(int s)
{
  for (int i = 0; i < 128; i++)
    {
      B[i*2+0] = A[i*s+0];
      B[i*2+1] = A[i*s+1];
    }
}

The vectorizer generates a { v2sf, v2sf } constructor for the strided load
because the backend tells it that it can efficiently initialize such a vector.
But we expand to the following, which doesn't look at all like it uses the
special init path.

(insn 14 13 16 (set (reg:V4SF 97)
        (const_vector:V4SF [
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
            ])) "t.c":8 -1
     (nil))

(insn 16 14 17 (set (reg:DI 99)
        (subreg:DI (reg:V4SF 97) 0)) "t.c":8 -1
     (nil))

(insn 17 16 18 (parallel [
            (set (reg:DI 100)
                (and:DI (reg:DI 99)
                    (const_int 0 [0])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8 -1
     (nil))

(insn 18 17 19 (parallel [
            (set (reg:DI 101)
                (ior:DI (reg:DI 100)
                    (mem:DI (reg/f:DI 88 [ _5 ]) [1 MEM[base: _5, offset: 0B]+0 S8 A32])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8 -1
     (nil))

(insn 19 18 21 (set (subreg:DI (reg:V4SF 97) 0)
        (reg:DI 101)) "t.c":8 -1
     (nil))

(insn 21 19 22 (set (reg:DI 103)
        (subreg:DI (reg:V4SF 97) 8)) "t.c":8 -1
     (nil))

(insn 22 21 23 (parallel [
            (set (reg:DI 104)
                (and:DI (reg:DI 103)
                    (const_int 0 [0])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8 -1
     (nil))

(insn 23 22 24 (parallel [
            (set (reg:DI 105)
                (ior:DI (reg:DI 104)
                    (mem:DI (plus:DI (reg/f:DI 88 [ _5 ])
                            (reg:DI 93 [ _20 ])) [1 MEM[base: _5, index: _20, offset: 0B]+0 S8 A32])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8 -1
     (nil))

(insn 24 23 25 (set (subreg:DI (reg:V4SF 97) 8)
        (reg:DI 105)) "t.c":8 -1
     (nil))
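
For comparison, the { v2sf, v2sf } construction can be written by hand with
GNU C vector extensions. This is only a sketch of the intended shape, not the
vectorizer's actual output; the typedefs v2sf/v4sf and the name foo_vec are
made up for illustration, and s is assumed to keep all accesses inside A.

```c
#include <string.h>

/* Illustrative typedefs (GNU C vector extension), not taken from the report. */
typedef float v2sf __attribute__ ((vector_size (8)));
typedef float v4sf __attribute__ ((vector_size (16)));

float A[1024];
float B[1024];

/* Hand-vectorized equivalent of foo: two scalar iterations per step.  */
void foo_vec (int s)
{
  for (int i = 0; i < 128; i += 2)
    {
      v2sf lo, hi;
      /* Two 8-byte strided loads, one per original iteration.  */
      memcpy (&lo, &A[i * s], sizeof lo);
      memcpy (&hi, &A[(i + 1) * s], sizeof hi);
      /* Build the V4SF value from the two V2SF halves.  */
      v4sf v = { lo[0], lo[1], hi[0], hi[1] };
      /* One contiguous 16-byte store.  */
      memcpy (&B[i * 2], &v, sizeof v);
    }
}
```

Ideally the expansion of the constructor would be a pair of 8-byte loads into
the low and high halves of one vector register rather than the integer
and/ior sequence on DImode subregs shown above.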


The issue seems to be that the constructor element vector types have BLKmode
as seen by store_constructor.  The mismatch between what the vectorizer
checks and what expansion gets arises because TYPE_MODE ends up calling
targetm.vector_mode_supported_p, while the vectorizer just asks for
mode_for_vector (elmode, group_size).exists (&vmode).
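
The disagreement between the two checks can be modelled with a toy example.
Everything here is hypothetical: the toy_* names are stand-ins, not GCC
interfaces, and the assumption that the target defines a V2SF mode without
reporting it as supported is made only for illustration.

```c
#include <stdbool.h>

/* Toy stand-ins, NOT GCC's real machine modes.  */
enum toy_mode { TOY_BLKmode, TOY_V2SFmode, TOY_V4SFmode };

/* Assumption: both vector modes are defined for the target ...  */
static bool toy_mode_defined (enum toy_mode m)
{
  return m == TOY_V2SFmode || m == TOY_V4SFmode;
}

/* ... but only V4SF is reported as supported for vector operations.  */
static bool toy_vector_mode_supported_p (enum toy_mode m)
{
  return m == TOY_V4SFmode;
}

/* Vectorizer-style check: is there a mode of this shape at all?  */
bool toy_vectorizer_accepts (enum toy_mode m)
{
  return toy_mode_defined (m);
}

/* Expansion-style lookup: fall back to BLKmode if the target hook says no.  */
enum toy_mode toy_type_mode (enum toy_mode m)
{
  return toy_vector_mode_supported_p (m) ? m : TOY_BLKmode;
}
```

In this model toy_vectorizer_accepts (TOY_V2SFmode) is true while
toy_type_mode (TOY_V2SFmode) yields TOY_BLKmode, which is the kind of
disagreement described above.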
