https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750
--- Comment #8 from Thiago Macieira <thiago at kde dot org> ---
Update again: it looks like the issue was in the next line, which I didn't
paste. It was performing _kortestz_mask32_u8 on an __mmask16, and that type
mismatch was causing this problem. If I use the correct _kortestz_maskXX_u8,
I'm getting:

        vmovdqu8        (%rsi), %ymm2
        vmovdqu8        32(%rsi), %ymm3
        vpcmpub $6, %ymm0, %ymm2, %k0
        vpcmpub $6, %ymm0, %ymm3, %k1
        kortestd        %k1, %k0
        je      .L794

        vmovdqu16       (%rsi), %ymm2
        vmovdqu16       32(%rsi), %ymm3
        vpcmpuw $6, %ymm0, %ymm2, %k0
        vpcmpuw $6, %ymm0, %ymm3, %k1
        kortestw        %k1, %k0
        je      .L807

So it looks like GCC is not completely wrong, but it could be more lenient
(Clang is). You can lower the severity of this issue.
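
To illustrate why the mask width has to match the kortest variant, here is a minimal scalar sketch in C. It does not use the real intrinsics: the model_* names and typedefs are hypothetical and only mimic the documented semantics of _kortestz_mask16_u8 (kortestw) and _kortestz_mask32_u8 (kortestd), namely "return 1 if the OR of the two masks is all zeroes". A byte compare on a 256-bit vector yields 32 mask bits and a word compare yields 16, so each should be paired with the test of the same width.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for __mmask16 and __mmask32.
   These names are illustrative and do not come from any real header. */
typedef uint16_t model_mmask16;
typedef uint32_t model_mmask32;

/* Models _kortestz_mask16_u8 (kortestw): OR the two 16-bit masks and
   return 1 when no bit is set in the 16-bit result. */
static unsigned char model_kortestz16(model_mmask16 a, model_mmask16 b)
{
    return (unsigned char)((uint16_t)(a | b) == 0);
}

/* Models _kortestz_mask32_u8 (kortestd): same test over all 32 bits. */
static unsigned char model_kortestz32(model_mmask32 a, model_mmask32 b)
{
    return (unsigned char)((uint32_t)(a | b) == 0);
}
```

In this model, the result of a vpcmpuw-style compare (16 lanes) would go through model_kortestz16, and the result of a vpcmpub-style compare (32 lanes) through model_kortestz32. Passing a 16-bit mask to the 32-bit test forces a widening conversion first, which is presumably the kind of extra work the mismatched intrinsic call triggered in GCC.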