https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102583

            Bug ID: 102583
           Summary: [x86] Failure to optimize 32-byte integer vector
                    conversion to 16-byte float vector properly when
                    converting upper part with -mavx2
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef int v8si __attribute__((vector_size(32)));
typedef float v4sf __attribute__((vector_size(16)));

v4sf high (v8si *srcp)
{
  v8si src = *srcp;
  return (v4sf) { (float)src[4], (float)src[5], (float)src[6], (float)src[7] };
}

With -O3 -mavx2, GCC outputs this:

high(int __vector(8)*):
        vmovdqa ymm0, YMMWORD PTR [rdi]
        vperm2i128      ymm0, ymm0, ymm0, 17
        vcvtdq2ps       xmm0, xmm0
        vzeroupper
        ret

LLVM instead folds the load of the upper 16 bytes into the conversion:

high(int __vector(8)*):
        vcvtdq2ps       xmm0, xmmword ptr [rdi + 16]
        ret

GCC itself emits the equivalent SSE code when -mavx2 is removed:

high(int __vector(8)*):
        cvtdq2ps        xmm0, XMMWORD PTR [rdi+16]
        ret
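
For comparison, here is a sketch of the same conversion written so that only the upper 16 bytes are ever loaded. The v4si typedef and the function name high_narrow_load are illustrative and not part of the original test case. Whether GCC then emits the single memory-operand vcvtdq2ps with -mavx2 has not been checked here; the only claim is that this function returns the same value as high() above.

```c
#include <string.h>

typedef int v8si __attribute__((vector_size(32)));
typedef int v4si __attribute__((vector_size(16)));   /* illustrative, not in the report */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical variant of high(): copy only elements 4..7 (the upper
   16 bytes of *srcp) into a 128-bit vector, then convert each int
   element to float with __builtin_convertvector. */
v4sf high_narrow_load (const v8si *srcp)
{
  v4si hi;
  memcpy (&hi, (const char *) srcp + sizeof (v4si), sizeof hi);
  return __builtin_convertvector (hi, v4sf);
}
```

The memcpy avoids forming a full 32-byte vector value in the source at all, so no lane extraction is needed before the conversion.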
