Sample code:

#include <stdio.h>

int main (void) {
  long long a = 0xFFFFFFFFFFFFLL; // 48 bits set
  int popcount;

#if 1
  popcount = __builtin_popcountll (a);
#else
  popcount = __popcountdi2 (a);
#endif

  printf ("%llx popcount = %d\n", a, popcount);
  return 0;
}

If -mpopcnt is enabled, this code only outputs the correct value (48) when -O3
is on (apparently the result is calculated at compile time). Without
optimizations, it apparently counts only the bits in the lower dword of the
long long variable.

OTOH, if __popcountdi2 is used, it works correctly (but according to the
assembly code it's not really using the popcnt instruction, which means it's
much slower).

Test runs and output follow (note you need a CPU which supports the popcnt
instruction):

=> gcc popcnt.c -o popcnt -Wall -O0 -mpopcnt && ./popcnt
ffffffffffff popcount = 32

=> gcc popcnt.c -o popcnt -Wall -O3 -mpopcnt && ./popcnt
ffffffffffff popcount = 48

=> gcc popcnt.c -o popcnt -Wall -O0 && ./popcnt
ffffffffffff popcount = 48

=> gcc popcnt.c -o popcnt -Wall -O3 && ./popcnt
ffffffffffff popcount = 48

-- 
           Summary: __builtin_popcountll fails with -O0 and -mpopcnt
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rbarreira at gmail dot com
 GCC build triplet: i586-suse-linux
  GCC host triplet: i586-suse-linux
GCC target triplet: i586-suse-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43406
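
For anyone trying to reproduce this, the following is a minimal sketch of two ways to cross-check the 64-bit count without relying on __builtin_popcountll. It is not part of the original test case. The helper names (popcount64_ref, popcount64_split, self_check) are hypothetical, and the split version assumes GCC's __builtin_popcount on a 32-bit unsigned int behaves correctly on the affected setup, which the report does not confirm.

```c
#include <assert.h>

/* Portable reference: each iteration clears the lowest set bit,
   so the loop runs once per set bit. Uses no compiler builtins. */
static int popcount64_ref(unsigned long long x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Possible workaround (assumption: the 32-bit builtin is unaffected):
   count the low and high 32-bit halves separately and add them. */
static int popcount64_split(unsigned long long x)
{
    return __builtin_popcount((unsigned int)(x & 0xFFFFFFFFu))
         + __builtin_popcount((unsigned int)(x >> 32));
}

/* Hypothetical self-check: both helpers must agree on a few values,
   including the 48-bit pattern from the report. */
static void self_check(void)
{
    assert(popcount64_ref(0ULL) == 0);
    assert(popcount64_ref(0xFFFFFFFFFFFFULL) == 48);
    assert(popcount64_split(0xFFFFFFFFFFFFULL) == popcount64_ref(0xFFFFFFFFFFFFULL));
    assert(popcount64_split(~0ULL) == 64);
}
```

Printing popcount64_ref (a) next to __builtin_popcountll (a) in the test program above would show directly whether the builtin drops the upper dword for a given set of flags.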