https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68714
--- Comment #7 from Marc Glisse <glisse at gcc dot gnu.org> ---

I find it strange that we do all operations on masks and not on "booleans" for vectors.

typedef int T;
T f(T a,T b,T c,T d){
  return (a<b)&(c<d);
}

we generate:

  _Bool _3;
  _Bool _6;
  _Bool _7;
  T _8;

  <bb 2>:
  _3 = a_1(D) < b_2(D);
  _6 = c_4(D) < d_5(D);
  _7 = _3 & _6;
  _8 = (T) _7;
  return _8;

that is, we are happy to do the bit_and on booleans. However, with

typedef int T __attribute__((vector_size(64)));

we now generate (-mavx512f):

  _3 = VEC_COND_EXPR <a_1(D) < b_2(D), { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _6 = VEC_COND_EXPR <c_4(D) < d_5(D), { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _7 = _3 & _6;
  return _7;

yielding this code:

        vpcmpgtd        %zmm0, %zmm1, %k1
        vpternlogd      $0xFF, %zmm4, %zmm4, %zmm4
        vmovdqa32       %zmm4, %zmm0{%k1}{z}
        vpcmpgtd        %zmm2, %zmm3, %k1
        vmovdqa32       %zmm4, %zmm2{%k1}{z}
        vpandd  %zmm2, %zmm0, %zmm0

We perform the bit_and on the mask type, whereas it would be better to do it on the boolean type and use 'kandw'.

For most platforms, (vec_cnd x -1 0) should be a NOP so it doesn't really matter, and for the few remaining (AVX512 and Sparc IIRC) we want to use "booleans" as much as possible and only convert to a mask late. I think that implies that we should pull operations on masks into operations on booleans (as in the original patch in comment #1 maybe, plus canonicalizing (vec_cnd x 0 -1)), and probably that forwarding conditions into the first argument of vec_cond should only be done late (around expand). But it is quite possible that my intuition is completely bogus here.
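
To make the mask-versus-boolean distinction concrete, here is a rough, portable sketch. It is only an illustration, not what GCC does internally: it uses a 4-lane vector instead of the 16-lane one above so it builds without -mavx512f, and the names v4si, and_on_masks, lt_bits and and_on_booleans are made up for this example. The scalar loops merely emulate a one-bit-per-lane predicate of the kind AVX512 keeps in a %k register.

```c
#include <assert.h>

/* Hypothetical 4-lane type; the comment above uses vector_size(64). */
typedef int v4si __attribute__((vector_size(16)));

/* Mask form: each comparison is expanded to a full vector of -1/0 lanes,
   and the bit_and is done on those wide masks (the vpandd pattern). */
v4si and_on_masks(v4si a, v4si b, v4si c, v4si d)
{
    return (a < b) & (c < d);
}

/* Emulated "boolean" form of a comparison: one bit per lane. */
unsigned lt_bits(v4si x, v4si y)
{
    v4si m = x < y;               /* lanes are -1 (true) or 0 (false) */
    unsigned bits = 0;
    for (int i = 0; i < 4; i++)
        bits |= (unsigned)(m[i] & 1) << i;
    return bits;
}

/* Boolean form: do the bit_and on the per-lane bits (what 'kandw' would
   do), and convert to a -1/0 mask only once, at the very end. */
v4si and_on_booleans(v4si a, v4si b, v4si c, v4si d)
{
    unsigned k = lt_bits(a, b) & lt_bits(c, d);
    v4si r = { 0, 0, 0, 0 };
    for (int i = 0; i < 4; i++)
        r[i] = -(int)((k >> i) & 1);
    return r;
}
```

Both functions compute the same lanes; the difference is only where the conversion to a mask happens, which is the point of pulling the operation onto booleans and converting late.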