------- Comment #10 from roger at eyesopen dot com  2007-04-27 18:20 -------
Paul's fix looks correct to me.  It appears that when the "#if 0" was added
to disable broken loop shifting at some point in the distant past, the critical
functionality.

  if (nDepend)
    break;

was accidentally removed.  This is similar to the idiom used a few lines
earlier "nDepend = 1;  break;"


I would normally have suggested that the type of nDepend and the return type of
gfc_dep_resolver be changed to bool, and the dead "#if 0" code removed to
clean-up this code...

However, I think a much better longer term strategy is to completely remove
gfc_conv_resolve_dependencies, and gfc_dep_resolver and the automatic handling
of temporaries in the scalarizer.  Instead, in line with the TODO above
gfc_conv_resolve_dependencies, I think it's much better to handle this at a
higher level using the new/preferred gfc_check_dependency API that's used
through-out the front-end.

For example, when lhs=rhs has a dependency, this can be transformed into
"tmp=rhs; lhs=tmp" during lowering or at the front-end tree-level, which
greatly simplifies the complex code in the scalarizer.  It also allows the
post-assignment copy "lhs=tmp" to be performed efficiently [though I may have
already implemented that in the scalarizer already :-)].

One benefit of doing this at a higher level of abstraction is that
loop-reversal isn't a simple matter of running our default scalarized loop
backwards (as hinted by the comment, and the PRs in bugzilla).  Consider the
general case of a(s1:e1,s2:e2,s3:e3) = rhs.  The dependency analysis in
gfc_check_dependency may show that s1:e1 needs to run forwards to avoid a
conflict, and that s3:e3 needs to run backwards for the same reason.  It's
trivial then to treat this almost like a source-to-source transformation (ala
KAP) and convert the assignment into a(s1:e1:1,s2:e2,s3:e3:-1) instead of
attempting to shoehorn all this into the scalarizer at a low-level and perform
book-keeping on the direction vectors.  By the time, we've lowered gfc_expr to
gfc_ss, we've made things much harder for ourselves.  And things get much more
complex once we start seriously tackling CSHIFT, TRANSPOSE and friends!

Keeping the scalarizer simple also eases support for autovectorization and/or
moving it into the middle-end (the topic of Toon's GCC summit presentation).

I'm as surprised as Paul that this hasn't been a problem before.  I suspect
it's because we use the alternate gfc_check_dependency in the vast majority of
cases, and the empirical observation in the research literature that most f90
array assignments in real code don't carry a dependency.

Roger
--


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31711

Reply via email to