https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88972

            Bug ID: 88972
           Summary: popcnt of limited 128-bit number with unnecessary zeroing
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile the following code on x86-64 with -Ofast -march=haswell:

int f(__uint128_t m)
{
  if (m < 64000)
    return __builtin_popcount(m);
  return -1;
}

The generated code with the trunk gcc looks like this:

   0:   b8 ff f9 00 00          mov    $0xf9ff,%eax
   5:   48 39 f8                cmp    %rdi,%rax
   8:   b8 00 00 00 00          mov    $0x0,%eax
   d:   48 19 f0                sbb    %rsi,%rax
  10:   72 0e                   jb     20 <f+0x20>
  12:   31 c0                   xor    %eax,%eax
  14:   f3 0f b8 c7             popcnt %edi,%eax
  18:   c3                      retq
  19:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  20:   b8 ff ff ff ff          mov    $0xffffffff,%eax
  25:   c3                      retq

The instruction at offset 12 is unnecessary.  I guess this is a left-over from
the popcnt of the upper half which is recognized to be unnecessary and left
out.  There is no addition anymore but somehow the register clearing survived.
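
For anyone who wants to reproduce this, here is a minimal sketch that wraps the function from the report in a small self-check. It only confirms the expected return values; it does not say anything about which instructions the compiler emits. The helper names `ref_popcount` and `count_mismatches` are hypothetical and not part of the original test case. Note that `__builtin_popcount` takes an `unsigned int`, so the 128-bit argument is truncated to its low 32 bits, which is harmless here because the guard limits `m` to values below 64000.

```c
#include <assert.h>

/* Function from the report, unchanged apart from layout. */
int f(__uint128_t m)
{
    if (m < 64000)
        return __builtin_popcount(m);
    return -1;
}

/* Hypothetical reference: portable bit count (Kernighan's method). */
static int ref_popcount(unsigned long long x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: returns how many inputs give an unexpected result. */
int count_mismatches(void)
{
    int bad = 0;

    /* Cover the whole guarded range plus some values past the limit. */
    for (unsigned long long i = 0; i < 70000; i++) {
        int expected = (i < 64000) ? ref_popcount(i) : -1;
        if (f(i) != expected)
            bad++;
    }

    /* A value with a bit set in the upper 64 bits must take the -1 path,
       even though its low half is small. */
    __uint128_t big = ((__uint128_t)1 << 64) | 1;
    assert(f(big) == -1);

    return bad;
}
```

To look at the generated code, one option is to compile the file with the flags from the report (`gcc -Ofast -march=haswell -c`) and disassemble the object file with `objdump -d`; the exact output will depend on the compiler version.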