https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125931
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The same vectorization also happens without -flto, and the regression reproduces there as well. I'll note that the profile shows no hits in the actual vector code, but the containing loop is spread out, so we possibly hit icache/branch predictor issues here. The code layout pessimization really hurts us here.
