On Sat, Feb 21, 2026 at 4:33 AM Nathan Bossart <[email protected]> wrote:
>
> On Fri, Feb 20, 2026 at 09:39:38AM -0600, Nathan Bossart wrote:
> > I spent some time looking at how clang/gcc compiled the plain-C version on
> > various architectures [0], and I was pleasantly surprised to discover that
> > at some point in recent history they started automatically converting it to
> > special popcount instructions. I suspect that you'd see better results on
> > ppc64le if you upgraded the compiler...

Interesting! Yeah, I deliberately sought a trailing-edge system for that test.

> If we're willing to rely on this behavior, we could even remove
> pg_popcount64_neon() and pg_popcount64_sse42(). We still need to add
> "pg_attribute_target("popcnt")" and a corresponding configure check for the
> x86 stuff, so it's not as impressive from a code-removal standpoint, but it
> at least allows us to remove some uses of intrinsics and inline assembly.

I'm not really a fan of replacing one x86 configure check with
another. Also, since the __asm now lives in an x86-only file, and
since C11 puts a floor on what compilers we support, I wonder if we
actually need a configure check at all anymore -- my thought is we can
just guard the inline assembly with __GNUC__. Then that whole file can
just guard on USE_SSE2. If someone tried to compile with something not
gcc/clang/MSVC, we'd still need a fallback in that file, but that's
trivial.
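
To make that concrete, here is a rough sketch of the shape I have in mind, not a patch. The function names are made up for illustration, __x86_64__ stands in for whatever file-level guard (e.g. USE_SSE2) we end up using, and the __builtin_cpu_supports() call is only a placeholder for our existing runtime CPU check. The MSVC branch is left out for brevity.

```c
#include <stdint.h>

/*
 * Portable fallback for compilers that are neither gcc/clang nor MSVC.
 * Each iteration clears the lowest set bit, so the loop runs once per
 * set bit in the word.
 */
static inline int
popcount64_portable(uint64_t word)
{
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;
		count++;
	}
	return count;
}

#if defined(__GNUC__) && defined(__x86_64__)
/*
 * Inline assembly guarded only by __GNUC__ (which clang also defines),
 * with no configure-time probe.  Hypothetical name, for illustration.
 */
static inline int
popcount64_asm(uint64_t word)
{
	uint64_t	res;

	__asm__ __volatile__("popcntq %1, %0" : "=r"(res) : "rm"(word) : "cc");
	return (int) res;
}
#endif

static inline int
popcount64(uint64_t word)
{
#if defined(__GNUC__) && defined(__x86_64__)
	/* Placeholder runtime check; assumed here, not the real code path. */
	if (__builtin_cpu_supports("popcnt"))
		return popcount64_asm(word);
#endif
	return popcount64_portable(word);
}
```

With something like this, a compiler that defines none of the guard macros simply falls through to the portable loop, so no configure check is needed to keep the build working.
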

As for Arm, clang 17 is pretty new, and compared to the rest of that
file, removing one line of intrinsics is straining out a gnat while
swallowing a camel. ;-)

In short, I'm in favor of v14, along with the comment about newer
compilers from v15.
--
John Naylor
Amazon Web Services