https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90483
--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <[email protected]>:

https://gcc.gnu.org/g:a87cdfd2ca3260126d3c75ddfb5cdea6e721d8d0

commit r17-597-ga87cdfd2ca3260126d3c75ddfb5cdea6e721d8d0
Author: Roger Sayle <[email protected]>
Date:   Tue May 19 07:29:08 2026 -0400

    i386: Optimize ptestz(x,-1) as ptestz(x,x) on x86

    This patch, inspired by PR target/90483 and libstdc++/118416, implements
    some RTL expansion-time simplifications of ptest.  A common idiom for
    testing a vector against zero is to use ptestz(mask,-1).  Alas the code
    generated for this is suboptimal, requiring materialization of an
    all_ones vector.  Given that ptestz(x,y) is defined as (x & y) == 0,
    an equivalent form is ptestz(mask,mask), saving an instruction (if ~0
    isn't available).

    Consider the function:

    typedef long long v2di __attribute__ ((__vector_size__ (16)));

    int foo (v2di x)
    {
      return __builtin_ia32_ptestz128(x,~(v2di){0,0});
    }

    with -O2 -mavx2, GCC currently generates:

    foo:    vpcmpeqd        %xmm1, %xmm1, %xmm1
            xorl    %eax, %eax
            vptest  %xmm1, %xmm0
            sete    %al
            ret

    with this patch, it now generates:

    foo:    xorl    %eax, %eax
            vptest  %xmm0, %xmm0
            sete    %al
            ret

    2026-05-19  Roger Sayle  <[email protected]>

    gcc/ChangeLog
            PR target/90483
            PR libstdc++/118416
            * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Refactor
            with optimizations for PTESTZ*, PTESTC* and PTESTNZC*, including
            transforming ptestz(x,-1) into ptestz(x,x).

    gcc/testsuite/ChangeLog
            PR target/90483
            PR libstdc++/118416
            * gcc.target/i386/sse4_1-ptest-8.c: New test case.
            * gcc.target/i386/sse4_1-ptest-9.c: Likewise.
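The identity behind the ptestz(x,-1) to ptestz(x,x) rewrite can be checked with a small scalar model. The sketch below is not the patch and does not use the real builtin: `vec128_model`, `ptestz_model`, `same_result` and `self_check` are hypothetical names introduced only for illustration, and the model assumes nothing beyond the definition quoted in the commit message, ptestz(x,y) = ((x & y) == 0), applied to a 128-bit value held as two 64-bit lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector: two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } vec128_model;

/* Models ptestz(x, y) as defined in the commit message:
   returns 1 when (x & y) == 0 over all 128 bits, else 0. */
int ptestz_model(vec128_model x, vec128_model y)
{
    return ((x.lo & y.lo) | (x.hi & y.hi)) == 0;
}

/* Returns 1 when ptestz(x, -1) and ptestz(x, x) agree for this x.
   Since x & -1 == x and x & x == x, both reduce to the test x == 0. */
int same_result(vec128_model x)
{
    const vec128_model all_ones = { UINT64_MAX, UINT64_MAX };
    return ptestz_model(x, all_ones) == ptestz_model(x, x);
}

/* Spot-checks a few inputs: zero, a bit in each lane, and all ones. */
void self_check(void)
{
    const vec128_model samples[] = {
        { 0, 0 },
        { 1, 0 },
        { 0, UINT64_C(0x8000000000000000) },
        { UINT64_MAX, UINT64_MAX },
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(same_result(samples[i]));
}
```

Calling `self_check()` from a test harness exercises the four sample inputs. The model only illustrates why the two forms compute the same zero flag; whether the rewritten form saves an instruction depends on the code generator, as the before/after assembly in the commit message shows.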
