https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33027
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-07-19
           Keywords|                            |missed-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
-O2 produces much better code than -O3 or -O3 -fno-tree-vectorize.

And we even optimize a slightly different case at -O2 where 1 is replaced with
any other value:

unsigned int fn(unsigned int n, unsigned int dmax) throw()
{
  for (unsigned int d = 0; d < dmax; ++d) {
    n += d?d:55;
  }
  return n;
}

Note GCC seems to produce better code than LLVM for both cases even.
Especially the -O3 with constant of 1, on aarch64, GCC produces an umax
instruction while LLVM produces a cmeq/bsl pair :).
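
A possible reading of the umax remark, as a sketch rather than a statement of
what either compiler does internally: for an unsigned d, the expression
d ? d : 1 has the same value as max(d, 1), so the select in the loop body can
be expressed as a single unsigned maximum. The function names sum_select and
sum_umax and the fallback parameter are hypothetical, and the code assumes the
"constant of 1" case is the loop quoted above with 1 in place of 55.

```cpp
#include <algorithm>

// The loop from the comment, with the fallback constant made a parameter
// (hypothetical helper; the quoted code hard-codes 55).
unsigned int sum_select(unsigned int n, unsigned int dmax, unsigned int fallback)
{
  for (unsigned int d = 0; d < dmax; ++d) {
    n += d ? d : fallback;
  }
  return n;
}

// Same loop for a fallback of 1, written with an unsigned maximum.
// For unsigned d: d == 0 gives max(0, 1) == 1, and d >= 1 gives max(d, 1) == d.
unsigned int sum_umax(unsigned int n, unsigned int dmax)
{
  for (unsigned int d = 0; d < dmax; ++d) {
    n += std::max(d, 1u);
  }
  return n;
}
```

The identity only holds when the fallback is 1; with 55 the two forms differ
for d between 1 and 54, which is consistent with the comment treating the two
constants as separate cases.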