On 15/01/2026 11:07, John Naylor wrote:
0003
s/fast/sse42/:
Seems okay in this file, but this isn't the best name, either. Maybe a
comment to head off future "corrections", something like:
"Technically, POPCNT is not part of SSE 4.2, and is not even a vector
operation, but many compilers emit the popcnt instruction with
-msse4.2 anyway."
s/slow/generic/:
I'm ambivalent about this. The "slow" designation is flat-out wrong
since at least Power and aarch64 can emit a single instruction here
without prodding the compiler. On the other hand, "generic" seems
wrong too, since e.g. pg_popcount64_slow() has three configure symbols
and two compiler builtins. :-D
"fallback", or "portable"?
A possible future project would be to have a truly generic simple
fallback in pure C and put all the fancy stuff in the header for
architectures that have unconditional hardware support. It would make
more sense to revisit the name then.
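
For illustration only, a "truly generic simple fallback in pure C" could
be as small as the sketch below. The name popcount64_portable is
hypothetical and not an existing PostgreSQL function; the point is just
that it needs no configure symbols and no compiler builtins.

```c
#include <stdint.h>

/*
 * Hypothetical pure-C fallback, not taken from the PostgreSQL tree.
 * Uses Kernighan's method: each iteration clears the lowest set bit,
 * so the loop runs once per set bit in "word".
 */
static inline int
popcount64_portable(uint64_t word)
{
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;		/* clear the lowest set bit */
		count++;
	}
	return count;
}
```

Anything faster (builtins, intrinsics) would then live in the header,
keyed on the architecture.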
Yeah, I noticed that on x86_64, pg_popcount_optimized is always a
function pointer with a runtime check, even if you use compiler flags to
target a CPU where the special instructions are available unconditionally.
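
A rough sketch of what skipping the pointer could look like, assuming
GCC or Clang, which define __POPCNT__ when built with -mpopcnt or
-msse4.2. All names here are hypothetical and this is not the existing
code; the runtime CPU check is left out and the chooser always picks the
portable loop.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__POPCNT__)

/* The compiler flags promise POPCNT, so call the builtin directly. */
static inline int
sketch_popcount64(uint64_t word)
{
	return __builtin_popcountll(word);
}

#else

/* Portable loop used when nothing better is known at compile time. */
static int
sketch_popcount64_portable(uint64_t word)
{
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;		/* clear the lowest set bit */
		count++;
	}
	return count;
}

static int	sketch_popcount64_choose(uint64_t word);

/* Starts at the chooser, which replaces itself on the first call. */
static int	(*sketch_popcount64_impl) (uint64_t word) = sketch_popcount64_choose;

static int
sketch_popcount64_choose(uint64_t word)
{
	/*
	 * Real code would test the CPU here (e.g. via CPUID) and might pick a
	 * hardware implementation; this sketch always picks the portable one.
	 */
	sketch_popcount64_impl = sketch_popcount64_portable;
	return sketch_popcount64_impl(word);
}

static inline int
sketch_popcount64(uint64_t word)
{
	return sketch_popcount64_impl(word);
}

#endif
```

With -msse4.2 the first branch compiles to a direct call with no
indirection; without it, callers go through the pointer as they do today.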
- Heikki