https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114560

            Bug ID: 114560
           Summary: Compilation error when using
                    _mm512_maskz_expandloadu_epi16 with only -mavx512vbmi2
           Product: gcc
           Version: 11.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meirav.grimberg at redis dot com
  Target Milestone: ---

Hello,
I'm using GCC 11.4. The problem also exists in GCC 13.

The following code fails to compile:

#include <immintrin.h>

int main(void) {
    unsigned short vec[16];
    for (size_t i =0; i < 16; i++) {
        vec[i] = 2;
    }
    __mmask32 mask = 0xAAAAAAAA;
    __m512i bf16_to_fp32 = _mm512_maskz_expandloadu_epi16(mask, vec);
    return 0;
}

g++ test.cpp -o test -mavx512vbmi2

In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:81,
                 from test.cpp:1:
/usr/lib/gcc/x86_64-linux-gnu/11/include/avx512vbmi2intrin.h: In function ‘int main()’:
/usr/lib/gcc/x86_64-linux-gnu/11/include/avx512vbmi2intrin.h:451:1: error: inlining failed in call to ‘always_inline’ ‘__m512i _mm512_maskz_expandloadu_epi16(__mmask32, const void*)’: target specific option mismatch
  451 | _mm512_maskz_expandloadu_epi16 (__mmask32 __A, const void * __B)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.cpp:9:58: note: called from here
    9 |     __m512i bf16_to_fp32 = _mm512_maskz_expandloadu_epi16(mask, vec);

According to the Intel® Intrinsics Guide, only the avx512vbmi2 flag is required
to use _mm512_maskz_expandloadu_epi16.

However, when I add the -mavx512bw flag to the compilation command, the code
compiles with no errors.
I noticed that in avx512vbmi2intrin.h this function is indeed located within
the section that requires both flags:

#if !defined(__AVX512VBMI2__) || !defined(__AVX512BW__)
#pragma GCC push_options
#pragma GCC target("avx512vbmi2,avx512bw")
#define __DISABLE_AVX512VBMI2BW__
#endif /* __AVX512VBMI2BW__ */

...

extern __inline __m512i
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm512_maskz_expand_epi16 (__mmask32 __A, __m512i __B)
{
  return (__m512i) __builtin_ia32_expandhi512_maskz ((__v32hi) __B,
                        (__v32hi) _mm512_setzero_si512 (), (__mmask32) __A);
}

...
#ifdef __DISABLE_AVX512VBMI2BW__
#undef __DISABLE_AVX512VBMI2BW__

#pragma GCC pop_options
#endif /* __DISABLE_AVX512VBMI2BW__ */


In addition, I tried to compile this code with clang 14 and the Intel C++
compiler, using only the -mavx512vbmi2 flag, and both succeeded.

Thank you.
