https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566
Bug ID: 112566
Summary: Some ctz/popcount/parity/ffs optimizations
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jakub at gcc dot gnu.org
Target Milestone: ---

I believe ctz(ext(x)) == ctz(x) if UB on zero, popcount(zext(x)) == popcount(x), parity(zext(x)) == parity(x), parity(sext(x)) == parity(x) if the extension is by an even number of bits, and ffs(ext(x)) == ffs(x).

So, e.g. with x86 -O2 -m32 -mbmi2 -mlzcnt -mpopcnt

int foo (unsigned int x) { return __builtin_ctzll (x); }
int bar (unsigned int x) { return __builtin_popcountll (x); }
int baz (unsigned int x) { return __builtin_parityll (x); }
int qux (int x) { return __builtin_ffsll (x); }
int corge (int x) { return __builtin_ctzll (x); }
int garply (int x) { return __builtin_parityll (x); }
int fred (unsigned int x) { return __builtin_ffsll (x); }

shouldn't use any double-word bit query, or similarly

int foo (unsigned _BitInt(256) x) { return __builtin_ctzg ((unsigned _BitInt(512)) x); }
int bar (unsigned _BitInt(256) x) { return __builtin_popcountg ((unsigned _BitInt(512)) x); }
int baz (unsigned _BitInt(256) x) { return __builtin_parityg ((unsigned _BitInt(512)) x); }
int qux (_BitInt(256) x) { return __builtin_ffsg ((_BitInt(512)) x); }
int corge (_BitInt(256) x) { return __builtin_ctzg ((unsigned _BitInt(512)) x); }
int garply (_BitInt(256) x) { return __builtin_parityg ((unsigned _BitInt(512)) x); }
int fred (unsigned _BitInt(256) x) { return __builtin_ffsg ((_BitInt(512)) x); }

Of course, we shouldn't do this if it would deoptimize some supported precision into an unsupported narrower one.

For clz, clz(zext(x)) == clz(x) + difference_in_precision, but at that point it might not be a win.
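As a rough sanity check of the identities in this report, the sketch below compares the 64-bit builtins applied to an extended value against the 32-bit builtins applied to the original one. It is only an illustration, not part of any proposed patch. The helper names check_zext and check_sext are hypothetical, and the code assumes a GCC-compatible compiler with 32-bit int and 64-bit long long. The ctz and clz comparisons are skipped for zero, where those builtins are undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the zero-extension identities hold
   for x, 0 otherwise.  Assumes 32-bit int and 64-bit long long.  */
int check_zext (uint32_t x)
{
  unsigned long long wide = x;  /* zext from 32 to 64 bits */

  if (__builtin_popcountll (wide) != __builtin_popcount (x))
    return 0;
  if (__builtin_parityll (wide) != __builtin_parity (x))
    return 0;
  if (__builtin_ffsll ((long long) wide) != __builtin_ffs ((int) x))
    return 0;
  /* ctz and clz are undefined on zero, so only compare for nonzero x.  */
  if (x != 0 && __builtin_ctzll (wide) != __builtin_ctz (x))
    return 0;
  /* clz needs the difference in precision (64 - 32) added back.  */
  if (x != 0 && __builtin_clzll (wide) != __builtin_clz (x) + 32)
    return 0;
  return 1;
}

/* Hypothetical helper: returns 1 if the sign-extension identities hold
   for x.  The extension is by 32 bits, an even number, so the parity
   is preserved even when x is negative.  */
int check_sext (int32_t x)
{
  long long wide = x;  /* sext from 32 to 64 bits */

  if (__builtin_parityll ((unsigned long long) wide)
      != __builtin_parity ((unsigned int) x))
    return 0;
  if (__builtin_ffsll (wide) != __builtin_ffs (x))
    return 0;
  if (x != 0
      && __builtin_ctzll ((unsigned long long) wide)
	 != __builtin_ctz ((unsigned int) x))
    return 0;
  return 1;
}
```

Calling check_zext and check_sext over a handful of edge values (0, 1, the sign bit, all ones) should return 1 in every case. popcount is deliberately left out of check_sext, since sign extension of a negative value adds set bits.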