https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92295

            Bug ID: 92295
           Summary: Inefficient vector constructor
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---
            Target: x86-64

[hjl@gnu-skx-1 microbenchmark]$ cat dup.c
typedef int X __attribute__((vector_size (32)));

X
foo (int x, int z)
{
  X y = { x, x, x, x, z, z, z, z };
  return y;
}

[hjl@gnu-skx-1 microbenchmark]$ gcc -S -O2 -march=skylake-avx512 dup.c
[hjl@gnu-skx-1 microbenchmark]$ cat dup.s
        .file   "dup.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        vmovd   %esi, %xmm2
        vmovd   %edi, %xmm3
        vpinsrd $1, %esi, %xmm2, %xmm1
        vpinsrd $1, %edi, %xmm3, %xmm0
        vpunpcklqdq     %xmm1, %xmm1, %xmm1
        vpunpcklqdq     %xmm0, %xmm0, %xmm0
        vinserti128     $0x1, %xmm1, %ymm0, %ymm0
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 9.2.1 20190827 (Red Hat 9.2.1-1)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-skx-1 microbenchmark]$ 

The same vector can be built in three instructions, with one broadcast
per 128-bit half:

        vpbroadcastd    %edi, %xmm0
        vpbroadcastd    %esi, %xmm1
        vinserti128     $1, %xmm1, %ymm0, %ymm0
        retq
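
For reference, the semantics of that sequence can be sketched at the source level. This is only an illustration, not a claim about what GCC emits for it: the type `H` and the functions `foo_split` and `check_same_lanes` are hypothetical names that do not appear in the report. The sketch builds each 128-bit half as a splat of one scalar, places the halves side by side, and checks the result against the original constructor.

```c
#include <assert.h>
#include <string.h>

typedef int X __attribute__((vector_size (32)));
/* Hypothetical 128-bit half type, not part of the original test case.  */
typedef int H __attribute__((vector_size (16)));

/* Constructor from the report.  */
X
foo (int x, int z)
{
  X y = { x, x, x, x, z, z, z, z };
  return y;
}

/* Hypothetical hand-split form: one splat per 128-bit half, then the
   two halves are copied into the low and high parts of the result.  */
X
foo_split (int x, int z)
{
  H lo = { x, x, x, x };
  H hi = { z, z, z, z };
  X y;
  memcpy (&y, &lo, sizeof lo);
  memcpy ((char *) &y + sizeof lo, &hi, sizeof hi);
  return y;
}

/* Asserts that both forms agree lane for lane.  */
void
check_same_lanes (int x, int z)
{
  X a = foo (x, z);
  X b = foo_split (x, z);
  assert (memcmp (&a, &b, sizeof a) == 0);
}
```

Compiling both functions with the command line from the report (gcc -S -O2 -march=skylake-avx512) would show whether either form reaches the broadcast-plus-insert sequence. Without AVX enabled, GCC may warn about the ABI of the 32-byte vector return.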
