https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

--- Comment #2 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Created attachment 49516
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49516&action=edit
Small benchmark

Here's a small benchmark for counting all 32-bit numbers with 16 bits set
according to the HAKMEM source.

Timing is (first float is elapsed time in seconds for version with division,
second float is for the shift):

  2.319526 601080391
  1.147284 601080391

with -O3 -march=native on an AMD Ryzen 7 1700X,

  4.539288 601080391
  2.700514 601080391

on POWER9.
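
For readers without the attachment at hand, here is a minimal sketch of what the two variants might look like. It is not the attached benchmark. It assumes the "HAKMEM source" means item 175 (Gosper's hack for stepping to the next larger number with the same count of set bits), and that the "shift" variant replaces the division by the lowest set bit with a right shift by the number of trailing zeros. The names next_div, next_shift and count_patterns are made up for illustration, and __builtin_ctz is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Next larger value with the same number of set bits, division form.
   x must be nonzero. */
static uint32_t next_div(uint32_t x)
{
    uint32_t lowest = x & (0u - x);   /* isolate lowest set bit (a power of two) */
    uint32_t ripple = x + lowest;     /* carry ripples through the lowest run of ones */
    uint32_t ones = x ^ ripple;       /* bits that changed */
    ones = (ones >> 2) / lowest;      /* right-justify the leftover ones */
    return ripple | ones;
}

/* Same computation, with the division by a power of two written as a
   shift by the count of trailing zeros. x must be nonzero. */
static uint32_t next_shift(uint32_t x)
{
    uint32_t lowest = x & (0u - x);
    uint32_t ripple = x + lowest;
    uint32_t ones = x ^ ripple;
    ones = (ones >> 2) >> __builtin_ctz(x);
    return ripple | ones;
}

/* Count the n-bit numbers with exactly k bits set by walking them in
   increasing order. Hypothetical helper; requires 1 <= k <= n <= 31 so
   that the limit fits in 32 bits and the walk cannot wrap around. */
static uint64_t count_patterns(unsigned k, unsigned n,
                               uint32_t (*next)(uint32_t))
{
    uint32_t limit = UINT32_C(1) << n;
    uint32_t x = (UINT32_C(1) << k) - 1;  /* smallest value with k bits set */
    uint64_t count = 0;

    while (x < limit) {
        count++;
        x = next(x);
    }
    return count;
}
```

Both step functions should return identical values for every nonzero input, since dividing by lowest and shifting by the trailing-zero count are the same operation when lowest is a power of two. The sketch stops at 31-bit widths to sidestep the wraparound that a full 32-bit walk has to handle.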