https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95018
--- Comment #32 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note I don't think the unrolling is excessive. Store motion then applies to all of count[], and with all computations hoisted out of the loop that may be a bit too much for register pressure, especially since we're using flag-based store motion. But it also causes the stores to be materialized on all exits of the loop, which means we end up with N*N conditional stores :/ I guess SM could be improved here.
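
To illustrate the N*N effect, here is a minimal hand-written C sketch of what flag-based store motion does to a loop with several exits. It is not GCC output and not the testcase from this PR. The function names, the two-element count[] and the exit condition are all assumptions made up for the example. Each moved location gets a register copy plus a "was stored" flag, and every loop exit gets one conditional store per location:

```c
#include <stdbool.h>

/* Hypothetical source loop: conditional stores to two locations,
   and two ways to leave the loop (early return, normal end). */
void source_loop(int count[2], const int *in, int len)
{
    for (int i = 0; i < len; i++) {
        if (in[i] < 0)
            return;                 /* exit 1 */
        if (in[i] & 1)
            count[1]++;
        else
            count[0]++;
    }                               /* exit 2 */
}

/* Hand-written model of the same loop after flag-based store motion.
   count[0] and count[1] live in locals inside the loop; a flag per
   location records whether the loop would have stored to it. */
void after_store_motion(int count[2], const int *in, int len)
{
    int c0 = count[0], c1 = count[1];       /* loads hoisted before the loop */
    bool c0_stored = false, c1_stored = false;

    for (int i = 0; i < len; i++) {
        if (in[i] < 0) {
            /* exit 1: one conditional store per moved location */
            if (c0_stored) count[0] = c0;
            if (c1_stored) count[1] = c1;
            return;
        }
        if (in[i] & 1) { c1++; c1_stored = true; }
        else           { c0++; c0_stored = true; }
    }
    /* exit 2: the same conditional stores again */
    if (c0_stored) count[0] = c0;
    if (c1_stored) count[1] = c1;
}
```

With 2 locations and 2 exits this already gives 4 conditional stores, and c0, c1 and both flags are live across the whole loop. In the sketch, the count grows as the number of moved locations times the number of exits.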