On Thu, Nov 10, 2016 at 2:53 AM, Richard Biener <richard.guent...@gmail.com> wrote:
> The biggest "lack" of loop distribution is the ability to undo CSE so for
I hadn't noticed this problem yet.  I will have to take a look.

> Then of course the cost model is purely modeled for STREAM (reduce the number
> of memory streams).  So loop distribution is expected to pessimize code for
> the CSE case in case you are not memory bound and improve things if you
> are memory bound.

I noticed this problem.  I think loop distribution should be callable from inside the vectorizer, or vice versa.  If a loop can't be vectorized, but distributing it allows the sub-loops to be vectorized, then we should go ahead and distribute, even if that slightly increases the number of memory streams, as the gain from vectorizing should be greater than the loss from the additional memory streams.  We could have a cost model that tries to compute the gain/loss here and make a better decision about when to distribute to increase vectorization at the expense of the number of memory streams.

This looks like a major project though, and may be more work than I have time for.

Jim
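
To make the distribute-to-vectorize case concrete, here is a minimal sketch in C.  The function names and the loop bodies are made up for illustration and are not taken from GCC or from any testcase in this thread.  The fused loop mixes an independent statement with a recurrence, so the loop as a whole would typically not be vectorized.  After distribution, the first loop is a plain element-wise multiply that should be vectorizable, at the cost of streaming c[] twice (four memory streams in the fused loop, five in total across the two distributed loops).

```c
#include <stddef.h>

/* Fused form: the second statement carries a dependence from one
   iteration to the next (b[i] needs b[i - 1]), which would typically
   prevent vectorizing the loop as a whole.
   Memory streams touched: a, b, c, d.  */
void
fused (size_t n, float *restrict a, float *restrict b,
       const float *restrict c, const float *restrict d)
{
  for (size_t i = 1; i < n; i++)
    {
      a[i] = c[i] * d[i];       /* independent across iterations */
      b[i] = b[i - 1] + c[i];   /* loop-carried dependence */
    }
}

/* Distributed form: the same computation split into two loops.
   The first loop has no loop-carried dependence and is a candidate
   for vectorization; the second keeps the recurrence and stays scalar.
   Memory streams touched: a, c, d in the first loop, then b, c in the
   second, so c is streamed twice.  */
void
distributed (size_t n, float *restrict a, float *restrict b,
             const float *restrict c, const float *restrict d)
{
  for (size_t i = 1; i < n; i++)
    a[i] = c[i] * d[i];

  for (size_t i = 1; i < n; i++)
    b[i] = b[i - 1] + c[i];
}
```

Both functions compute the same values; whether the distributed form is actually faster depends on how memory bound the loop is, which is the gain/loss trade-off a combined cost model would have to estimate.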