https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123238
--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <[email protected]>:

https://gcc.gnu.org/g:ae80ad655d514d7c275b21f5d1ad155793ae4cc0

commit r17-1923-gae80ad655d514d7c275b21f5d1ad155793ae4cc0
Author: Roger Sayle <[email protected]>
Date:   Fri Jun 26 16:20:05 2026 +0100

    i386: ix86_expand_sse_movcc improvements

    This patch implements Alexander Monakov's suggestion from PR 123238.
    Traditionally, the x86_64 backend implements VCOND_MASK using a three
    instruction sequence of pand, pandn and por (requiring three registers),
    however when op_true and op_false are both constant vectors, this can
    be done using just two instructions, pand and pxor (requiring only two
    registers).

    This requires delaying forcing const_vector operands to memory (the
    constant pool) as late as possible, including changing the predicates
    on the define_expand patterns that call ix86_expand_sse_movcc to
    (consistently) accept vector_or_const_vector_operand.

    void f(char c[])
    {
      for (int i = 0; i < 8; i++)
        c[i] = c[i] ? 'a' : 'c';
    }

    Before with -O2 (12 instructions):
    f:      movq    (%rdi), %xmm0
            pxor    %xmm1, %xmm1
            movabsq $7016996765293437281, %rdx      // {'a','a','a'...}
            movabsq $7161677110969590627, %rax      // {'c','c','c'...}
            movq    %rdx, %xmm2
            pcmpeqb %xmm1, %xmm0
            movq    %rax, %xmm1
            pand    %xmm0, %xmm1
            pandn   %xmm2, %xmm0
            por     %xmm1, %xmm0
            movq    %xmm0, (%rdi)
            ret

    After with -O2 (11 instructions):
    f:      movq    (%rdi), %xmm0
            pxor    %xmm1, %xmm1
            movabsq $144680345676153346, %rdx       // {2,2,2...}
            movabsq $7016996765293437281, %rax      // {'a','a','a'...}
            pcmpeqb %xmm1, %xmm0
            movq    %rdx, %xmm1
            pand    %xmm1, %xmm0
            movq    %rax, %xmm1
            pxor    %xmm1, %xmm0
            movq    %xmm0, (%rdi)
            ret

    2026-06-26  Roger Sayle  <[email protected]>
                Hongtao Liu  <[email protected]>

    gcc/ChangeLog
            PR target/123238
            * config/i386/i386-expand.cc: Delay calling force_reg on op_true
            and op_false.  Generate an AND then XOR sequence if op_true and
            op_false are both CONST_VECTOR_P.
            * config/i386/mmx.md (vcond_mask_<mode>v4hi): Allow operands 1
            and 2 to be vector_or_const_vector_operand.
            (vcond_mask_<mode>v2hi): Likewise.
            (vcond_mask_<mode><mmxintvecmodelower>): Likewise.
            (vcond_mask_<mode><mode>): Likewise.
            * config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
            Likewise.
            (vcond_mask_<mode><sseintvecmodelower>): Likewise.
            (vcond_mask_v1tiv1ti): Likewise.
            (vcond_mask_<mode><sseintvecmodelower>): Likewise.
            (vcond_mask_<mode><sseintvecmodelower>): Likewise.
            * config/i386/predicates.md (vector_or_0_or_1s_operand): Delete
            predicate with no remaining uses.

    gcc/testsuite/ChangeLog
            PR target/123238
            * gcc.target/i386/pr123238-2.c: New test case.
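
The pand/pxor sequence in the commit message appears to rest on a bitwise identity: for a mask whose lanes are all-ones or all-zeros, (mask & t) | (~mask & f) equals (mask & (t ^ f)) ^ f, and t ^ f can be folded at compile time when both values are constant. The following is a minimal scalar sketch of that identity in C, not the code in ix86_expand_sse_movcc. It models the 8-byte vector as a uint64_t, and the function and macro names are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* The two constants from the example, one byte per lane. */
#define VEC_A 0x6161616161616161ULL   /* {'a','a','a',...} */
#define VEC_C 0x6363636363636363ULL   /* {'c','c','c',...} */

/* Three-operation blend, mirroring pand / pandn / por:
   lanes where mask is all-ones take t, the rest take f.  */
static uint64_t blend_and_andn_or(uint64_t mask, uint64_t t, uint64_t f)
{
    return (mask & t) | (~mask & f);
}

/* Two-operation blend, mirroring pand / pxor.  When t and f are both
   constants, diff is itself a constant, so only the AND and the XOR
   remain at run time.  */
static uint64_t blend_and_xor(uint64_t mask, uint64_t t, uint64_t f)
{
    uint64_t diff = t ^ f;          /* {2,2,2,...} for 'c' ^ 'a' */
    return (mask & diff) ^ f;
}

/* Returns 1 if both blends agree for every combination of per-byte
   all-ones/all-zeros masks over 8 lanes (256 masks), else 0.  */
static int blends_agree(uint64_t t, uint64_t f)
{
    for (unsigned bits = 0; bits < 256; bits++) {
        uint64_t mask = 0;
        for (int lane = 0; lane < 8; lane++)
            if (bits & (1u << lane))
                mask |= 0xFFULL << (8 * lane);
        if (blend_and_andn_or(mask, t, f) != blend_and_xor(mask, t, f))
            return 0;
    }
    return 1;
}
```

In the "After" listing the mask comes from pcmpeqb against zero, so it is set where c[i] == 0. Those lanes take 'c', which corresponds to t = VEC_C and f = VEC_A here, with VEC_A ^ VEC_C giving the {2,2,2...} constant loaded by the first movabsq.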
