https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68483
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-11-23
          Component|other                       |target
             Blocks|                            |53947
   Target Milestone|---                         |5.3
            Summary|gcc 5.2: suboptimal code    |[5/6 Regression] gcc 5.2:
                   |compared to 4.9             |suboptimal code compared to
                   |                            |4.9
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hum, on x86_64 I don't see either GCC 4.9 or GCC 5.2 vectorize the function at
all because they fail to analyze the evolution of the dataref for input[j] as
the initial j of the inner loop is not propagated as zero.

With i?86 I can confirm your observation but I don't see it fixed on trunk.

Note that this boils down to vector shift detection of permutes where (IIRC)
some patterns were not properly guarded on SSE3 support previously and a
wrong-code bug was fixed conservatively on the GCC 5 branch while missing
support was only implemented on trunk.

The failure to vectorize on x86_64 isn't a regression.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations