https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100089
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-15
             Status|UNCONFIRMED                 |NEW
            Summary|[11 Performance regression  |[11 Regression] 30%
                   |] 30% for                   |performance regression for
                   |denbench/mp2decoddata2 with |denbench/mp2decoddata2 with
                   |-O3                         |-O3
   Target Milestone|---                         |11.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Indeed loop vectorization throws if-converted bodies at the BB vectorizer
as a last resort (because BB vectorization doesn't do if-conversion itself).
But the BB vectorizer then uses the if-converted scalar code as the thing
to cost against (costing against the not if-converted loop body isn't
really possible).  To quote

      /* If we applied if-conversion then try to vectorize the
         BB of innermost loops.
         ???  Ideally BB vectorization would learn to vectorize
         control flow by applying if-conversion on-the-fly, the
         following retains the if-converted loop body even when
         only non-if-converted parts took part in BB vectorization.  */
      if (flag_tree_slp_vectorize != 0
          && loop_vectorized_call
          && ! loop->inner)
        {

as a "hack" we could see to scalar cost the always executed part of
the not if-converted loop body and apply the full bias of this cost
vs. the scalar cost of the if-converted body to the scalar cost of the
BB vectorization.  But that's really apples-to-oranges in the end (as it
is now).

Maybe we can cost the whole partly vectorized loop body in this mode
and compare it against the scalar cost of the original loop.  But even
the loop vectorizer costs the if-converted scalar loop, so it is off
as well.

Long-term if-conversion needs to be integrated with vectorization so
we can at least keep track of what stmts were originally executed
conditional and what not.

Short-term I'm not sure we can do much.  Doing SLP on the if-converted
body does help in quite some cases.
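
For readers unfamiliar with if-conversion, here is a minimal hand-written
sketch of why the if-converted body is a questionable baseline for scalar
cost.  It is not taken from the benchmark or from GCC; the function names
and the loop itself are hypothetical, and the second function only
approximates by hand what the if-conversion pass might produce.

```c
#include <stddef.h>

/* Hypothetical original loop: the multiply-add runs only for
   negative inputs, so its cost is paid on some iterations only.  */
void scale_negatives_branchy(int *dst, const int *src, size_t n,
                             int scale, int bias)
{
    for (size_t i = 0; i < n; i++) {
        if (src[i] < 0)
            dst[i] = src[i] * scale + bias;
        else
            dst[i] = src[i];
    }
}

/* Hand-written approximation of the if-converted form: the
   multiply-add is computed on every iteration and a select picks
   the result.  The control flow is gone, but the scalar work per
   iteration has grown compared with the loop above.  */
void scale_negatives_ifcvt(int *dst, const int *src, size_t n,
                           int scale, int bias)
{
    for (size_t i = 0; i < n; i++) {
        int v = src[i];
        int t = v * scale + bias;   /* now always executed */
        dst[i] = v < 0 ? t : v;     /* select replaces the branch */
    }
}
```

Both functions compute the same result, but a cost model that measures
vector code against the second form counts work the original loop often
skips, which is one way vectorization can look more profitable than it is.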