https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114560
Bug ID: 114560
Summary: Compilation error when using _mm512_maskz_expandloadu_epi16 with only -mavx512vbmi2
Product: gcc
Version: 11.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: meirav.grimberg at redis dot com
Target Milestone: ---

Hello,

I'm using gcc 11.4. The problem also exists in gcc 13.
The following code fails to compile:

    #include <immintrin.h>

    int main(void) {
        unsigned short vec[16];
        for (size_t i =0; i < 16; i++) {
            vec[i] = 2;
        }
        __mmask32 mask = 0xAAAAAAAA;
        __m512i bf16_to_fp32 = _mm512_maskz_expandloadu_epi16(mask, vec);
        return 0;
    }

Command and output:

    g++ test.cpp -o test -mavx512vbmi2;
    In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:81,
                     from test.cpp:1:
    /usr/lib/gcc/x86_64-linux-gnu/11/include/avx512vbmi2intrin.h: In function ‘int main()’:
    /usr/lib/gcc/x86_64-linux-gnu/11/include/avx512vbmi2intrin.h:451:1: error: inlining failed in call to ‘always_inline’ ‘__m512i _mm512_maskz_expandloadu_epi16(__mmask32, const void*)’: target specific option mismatch
      451 | _mm512_maskz_expandloadu_epi16 (__mmask32 __A, const void * __B)
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    test.cpp:9:58: note: called from here
        9 |     __m512i bf16_to_fp32 = _mm512_maskz_expandloadu_epi16(mask, vec);

According to the Intel® Intrinsics Guide, only the avx512vbmi2 flag is required to use _mm512_maskz_expandloadu_epi16. However, when I add the -mavx512bw flag to the compilation command, the code compiles as expected with no errors.

I noticed that in avx512vbmi2intrin.h this function is indeed located within the section that requires both flags:

    #if !defined(__AVX512VBMI2__) || !defined(__AVX512BW__)
    #pragma GCC push_options
    #pragma GCC target("avx512vbmi2,avx512bw")
    #define __DISABLE_AVX512VBMI2BW__
    #endif /* __AVX512VBMI2BW__ */
    ...
    extern __inline __m512i
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _mm512_maskz_expand_epi16 (__mmask32 __A, __m512i __B)
    {
      return (__m512i) __builtin_ia32_expandhi512_maskz ((__v32hi) __B,
                                                         (__v32hi) _mm512_setzero_si512 (),
                                                         (__mmask32) __A);
    }
    ...
    #ifdef __DISABLE_AVX512VBMI2BW__
    #undef __DISABLE_AVX512VBMI2BW__
    #pragma GCC pop_options
    #endif /* __DISABLE_AVX512VBMI2BW__ */

In addition, I tried to compile this code with clang 14 and the Intel C++ compiler, using only the -mavx512vbmi2 flag, and both succeeded.

Thank you.
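
For reference, the condition implied by the header guard quoted above can be checked from user code. This is only a minimal sketch, assuming the quoted `#if` is the sole gate on the intrinsic in the GCC versions mentioned. The names `kHasVbmi2`, `kHasBw`, `kGccAcceptsExpandLoadEpi16` and `missing_flags` are hypothetical and are not part of any GCC header.

```cpp
#include <string>

// True when this translation unit was compiled with AVX512VBMI2 enabled
// (for example via -mavx512vbmi2), as signalled by the predefined macro.
#ifdef __AVX512VBMI2__
constexpr bool kHasVbmi2 = true;
#else
constexpr bool kHasVbmi2 = false;
#endif

// True when this translation unit was compiled with AVX512BW enabled
// (for example via -mavx512bw).
#ifdef __AVX512BW__
constexpr bool kHasBw = true;
#else
constexpr bool kHasBw = false;
#endif

// Assumption: mirrors the guard quoted from avx512vbmi2intrin.h, where the
// section holding the epi16 expand intrinsics targets "avx512vbmi2,avx512bw".
constexpr bool kGccAcceptsExpandLoadEpi16 = kHasVbmi2 && kHasBw;

// Hypothetical helper: lists the -m flags that are still missing for the
// guarded section, or returns an empty string when both are present.
inline std::string missing_flags() {
    std::string out;
    if (!kHasVbmi2) out += "-mavx512vbmi2 ";
    if (!kHasBw) out += "-mavx512bw ";
    return out;
}
```

Built with only -mavx512vbmi2 on the GCC versions in this report, `missing_flags()` would be expected to name -mavx512bw, which matches the observation that adding that flag makes the reproducer compile.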