On 20/07/15 20:31, Sebastian Pop wrote:
Tom de Vries wrote:
So I wondered: why not always use the graphite dependency analysis
in parloops?  (Of course you could use -floop-parallelize-all, but
that also changes the heuristic.)  So I wrote a patch for parloops
to use the graphite dependency analysis by default (so without
-floop-parallelize-all), but while testing I found out that all the
reduction test-cases started failing, because the modifications
graphite makes to the code mess up the parloops reduction
analysis.

Then I came up with this patch, which:
- first runs a parloops pass, restricted to reduction loops only,

I would prefer to fix graphite to catch the reduction loop and avoid running an
extra pass before graphite for that case.

Can you please specify which files are failing to be
parallelized?  Are they all the testcases for which you update
the flags?

Yep, f.i. autopar/reduc-1.c.

Also it seems to me that you are missing -ffast-math to parallelize all these
loops: without that flag graphite would not mark reductions as
associative/commutative operations and they would not be recognized as parallel.

For an unsigned int reduction we don't need -ffast-math, so we don't have to specify it for parloops. It seems graphite is too strict here, since it won't do any reductions without -fassociative-math.
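
For illustration, the kind of loop I have in mind (just a minimal sketch, not the actual testcase):

/* Minimal sketch (not the actual autopar/reduc-1.c): an unsigned int sum
   reduction.  Unsigned arithmetic wraps, so reassociating the additions
   cannot change the result, and no -ffast-math/-fassociative-math should
   be needed to treat it as a parallel reduction
   (e.g. -O2 -ftree-parallelize-loops=2).  */

unsigned int
usum (unsigned int *a, int n)
{
  unsigned int s = 0;
  for (int i = 0; i < n; i++)
    s += a[i];
  return s;
}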

But indeed, with -ffast-math -ftree-parallelize-loops=2 -floop-parallelize-all, we are able to parallelize the 3 reduction loops in autopar/reduc-1.c.

Is that something the current parloops detection is not too strict about?

Parloops uses vect_is_simple_reduction_1, which does some extensive checking to see whether reordering of the operations is allowed. Graphite's checking seems to be limited to testing -fassociative-math, which makes me suspect that tests are missing there, f.i. for TYPE_OVERFLOW_TRAPS.
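
As an example of the sort of case I mean (a hypothetical sketch, not an existing testcase), a signed reduction compiled with -ftrapv:

/* Hypothetical sketch: a signed int sum reduction compiled with -ftrapv,
   so signed overflow traps (TYPE_OVERFLOW_TRAPS is true for int).
   Reassociating the additions can change whether or where a trap is
   raised, so it is not safe to reorder them and treat this as a
   parallel reduction.  */

int
ssum (int *a, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += a[i];
  return s;
}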

Thanks,
- Tom