On Wed, 16 Oct 2013, Tobias Burnus wrote:

> Frederic Riss wrote:
> > Just one question. You describe the pragma in the doco patch as:
> >
> > +This pragma tells the compiler that the immediately following @code{for}
> > +loop can be executed in any loop index order without affecting the result.
> > +The pragma aids optimization and in particular vectorization as the
> > +compiler can then assume a vectorization safelen of infinity.
> >
> > I'm not a specialist, but I was always told that the 'original'
> > meaning of ivdep (which I believe was introduced by Cray), was that
> > the compiler could assume that there are only forward dependencies in
> > the loop, but not that it can be executed in any order.
> 
> The nice thing about #pragma ivdep is that there is no real standard. And
> the explanation of the different vendors is also not completely clear.
> 
> 
> Some overview about this is given in the following file on pages 13-14 for
> Cray Research PVP, MIPSPRO & Open64, Intel ICC, Multiflow
> http://sysrun.haifa.il.ibm.com/hrl/greps2007/papers/GREPS2007-Benoit.pdf
> 
> That's summarized as:
> - vector: ignore lexical upward dependencies (Cray PVP, Intel ICC)
> - parallel: ignore loop-carried dependencies (MIPSPRO, Open64)
> - liberal: ignore loop-variant dependencies (Multiflow)
> 
> 
> The quotes for Cray and Intel are below.
> 
> Cray: 
> http://docs.cray.com/books/004-2179-001/html-004-2179-001/brtlrwh.html#EKZ5MRWH
> "The ivdep directive tells the compiler to ignore vector dependencies for
>  the loop immediately following the directive. Conditions other than vector
>  dependencies can inhibit vectorization. If these conditions are satisfactory,
>  the loop vectorizes. This directive is useful for some loops that contain
>  pointers and indirect addressing. The format of this directive is as follows:
>  #pragma _CRI ivdep"

Which suggests we use

#pragma GCC ivdep

to avoid colliding with possibly different semantics in existing programs
that use variants of this pragma?
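
For illustration, a minimal sketch of how the namespaced spelling might be
used on the kind of loop the Cray text mentions (indirect addressing).  The
function name scatter_add and the placement of the pragma directly before the
for loop are assumptions, not something taken from the patch:

```c
#include <stddef.h>

/* Scatter-add through an index array.  The compiler cannot prove that
   idx[] holds no duplicates, so it has to assume that a[idx[i]] may be
   touched by another iteration.  The caller promises that idx[] is a
   permutation; the pragma passes that promise on to the vectorizer.  */
void scatter_add(double *a, const double *b, const int *idx, size_t n)
{
#pragma GCC ivdep
    for (size_t i = 0; i < n; i++)
        a[idx[i]] += b[i];
}
```

A compiler that does not know the pragma would simply ignore it, so the
result is the same either way as long as the caller's promise holds.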

> Intel: 
> http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-B25ABCC2-BE6F-4599-AEDF-2434F4676E1B.htm
> "The ivdep pragma instructs the compiler to ignore assumed vector
>  dependencies. To ensure correct code, the compiler treats an assumed
>  dependence as a proven dependence, which prevents vectorization. This
>  pragma overrides that decision. Use this pragma only when you know that
>  the assumed loop dependencies are safe to ignore."

This suggests that _known_ dependences are still treated as dependences.
But what is known obviously depends on the implementation which
may not know that a[i] and a[i+1] depend but merely assume it.  Not
a standard-proof definition of the pragma ;)
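
To make the a[i] / a[i+1] case concrete, here is a small sketch (all function
names are made up for the example).  It runs the same loop body in forward
order, in reversed order, and in a hand-written model of a 4-lane vector
version that loads a whole chunk before storing it.  The chunked version
matches the forward loop, the reversed one does not, which is the difference
between "ignore this dependence when vectorizing" and "any loop index order":

```c
#include <stddef.h>

/* a[i] reads a[i+1], which a later iteration overwrites.  */
void shift_forward(int *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i + 1] + 1;
}

/* Same body, iterations run from last to first.  */
void shift_reversed(int *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i-- > 0; )
        a[i] = a[i + 1] + 1;
}

/* Model of a 4-lane vector version: all loads of a chunk happen
   before any of its stores.  */
void shift_chunked(int *a, size_t n)
{
    size_t i = 0;
    while (i + 1 < n) {
        int tmp[4];
        size_t len = (n - 1 - i < 4) ? n - 1 - i : 4;
        for (size_t k = 0; k < len; k++)
            tmp[k] = a[i + k + 1] + 1;
        for (size_t k = 0; k < len; k++)
            a[i + k] = tmp[k];
        i += len;
    }
}
```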

That said, safelen even overrides known dependences (but with unknown
distance vector)! (That looks like a bug to me, or at least a QOI issue.)

> 
> > The Intel docs give this example:
> ...
> > Given your description, this loop wouldn't be a candidate for ivdep,
> > as reversing the loop index order changes the semantics. I believe
> > that the way you interpret it (ie. setting vectorization safe length
> > to INT_MAX) is correct with respect to this other definition, though.
> 
> Do you have a suggestion for a better wording? My idea was to interpret
> this part similar to OpenMP's simd with safelen=infinity. (Actually, I
> believe loop->safelen was added for OpenMPv4's and/or Cilk Plus's "simd".)
> 
> OpenMPv4.0, http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf , states
> for this (excerpt from page 70):
> "A SIMD loop has logical iterations numbered 0, 1,...,N-1 where N is the
> number of loop iterations, and the logical numbering denotes the sequence
> in which the iterations would be executed if the associated loop(s) were
> executed with no SIMD instructions. If the safelen clause is used then no
> two iterations executed concurrently with SIMD instructions can have a
> greater distance in the logical iteration space than its value. The
> parameter of the safelen clause must be a constant positive integer
> expression. The number of iterations that are executed concurrently at
> any given time is implementation defined. Each concurrent iteration will
> be executed by a different SIMD lane. Each set of concurrent iterations
> is a SIMD chunk."

OTOH, if we are mapping ivdep to safelen why not simply allow

#pragma GCC safelen 4

?
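
There is no such GCC pragma today, so as a sketch of what a bounded safelen
means I use the OpenMP 4.0 spelling quoted above instead.  The function name
step8 is made up.  The loop carries a dependence of distance 8, and the
safelen of 4 is deliberately well below that, so the example does not depend
on how the boundary case is read:

```c
#include <stddef.h>

/* Each iteration reads the element written 8 iterations earlier.
   Running up to 4 consecutive iterations concurrently can never pair a
   write with the read that depends on it, so safelen(4) is a safe bound;
   no bound at all (as with ivdep) would not be a valid promise here.  */
void step8(int *a, size_t n)
{
#pragma omp simd safelen(4)
    for (size_t i = 8; i < n; i++)
        a[i] = a[i - 8] + 1;
}
```

The pragma only has an effect when OpenMP SIMD support is enabled (for GCC,
-fopenmp or -fopenmp-simd); otherwise it is ignored and the loop runs
sequentially with the same result.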

> 
> > Oh, and are there any plans to maintain this information in some way
> > till the back-end? Software pipelining could be another huge winner
> > for that kind of dependency analysis simplification.
> 
> I don't know until when loop->safelen is kept. As it is late in the
> middle-end, providing the backend with this information should be
> simple.

It's kept as long as we preserve loops which at the moment is until
after RTL loop optimizations are finished.  Extending this isn't
hard, I just didn't see a reason to do that.

Richard.
