https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that we do unroll the loop at -O3, but only late, and after that we do not
re-run bswap recognition (which happens before loop optimization). At -O2 we
don't unroll because that would increase code size too much. Recognition of the
"final value computation" is done in the sccp pass, which could be amended for
this (final_value_replacement_loop, tree-scalar-evolution.cc).
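
For illustration only, here is a minimal sketch of the kind of code this is about. It is not the testcase from the report; the function names bswap_loop and bswap_unrolled are hypothetical. The first function computes a 32-bit byte swap in a loop, so its result only becomes a straight-line expression once the loop is fully unrolled or its final value is computed. The second is the hand-unrolled shift-and-mask form, which is the shape a byte-swap recognizer would typically be expected to match.

```c
#include <stdint.h>

/* Hypothetical loop form: moves the bytes of x into r one at a time,
   lowest byte first, so r ends up holding the bytes of x reversed. */
uint32_t bswap_loop(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        r = (r << 8) | (x & 0xff);  /* append the current low byte of x */
        x >>= 8;                    /* advance to the next byte */
    }
    return r;
}

/* Hand-unrolled equivalent: a single shift-and-mask expression with
   no loop-carried state. */
uint32_t bswap_unrolled(uint32_t x)
{
    return ((x & 0x000000ffu) << 24)
         | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8)
         | ((x & 0xff000000u) >> 24);
}
```

Both functions return the same value for every input; whether a given GCC version turns either of them into a single byte-swap instruction depends on the optimization level and pass ordering described above.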