On Wed, 27 Nov 2013, Jakub Jelinek wrote:

> On Wed, Nov 27, 2013 at 10:53:56AM +0100, Richard Biener wrote:
> > Hmm.  I'm still thinking that we should handle this during the regular
> > transform step.
> 
> I wonder if it can't be done instead just in vectorizable_load,
> if LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) and the load is
> invariant, just emit the (broadcasted) load not inside of the loop, but on
> the loop preheader edge.

It is safe even for !LOOP_REQUIRES_VERSIONING_FOR_ALIAS.  It's just
a missed optimization I even noted when originally implementing
support for invariant loads ...

> > *** gcc/tree-vect-data-refs.c       (revision 205435)
> > --- gcc/tree-vect-data-refs.c       (working copy)
> > *************** again:
> > *** 3668,3673 ****
> > --- 3668,3682 ----
> >         }
> >       STMT_VINFO_STRIDE_LOAD_P (stmt_info) = true;
> >     }
> > +       else if (loop_vinfo
> > +          && integer_zerop (DR_STEP (dr)))
> > +   {
> > +     /* All loads from a non-varying address will be disambiguated
> > +        by data-ref analysis or via a runtime alias check and thus
> > +        they will become invariant.  Force them to be vectorized
> > +        as external.  */
> > +     STMT_VINFO_DEF_TYPE (stmt_info) = vect_external_def;
> > +   }
> 
> I think this is unsafe for simd loops.
> I'd say:
> int a[1024], b[1024];
> 
> int foo (void)
> {
>   int i;
>   #pragma omp simd safelen(8)
>   for (i = 0; i < 1024; i++)
>     {
>       a[i] = i;
>       b[i] = a[0];
>     }
> }
> 
> is valid testcase, the loop behaves the same if you execute it
> sequentially, or vectorize using SIMD (hardware or emulated) instructions
> with vectorization factor of 2, 4 or 8, as long as you do all memory
> operations (either using scalar insns or simd instructions) in the order
> they were written, which I believe the vectorizer right now handles
> correctly, but the hoisting this patch wants to perform is not fine,
> unless data ref analysis would prove that it can't alias.  For non-simd
> loops we of course perform that data ref analysis and either version for
> alias, or prove that the drs can't alias, but for simd loops we take as
> given that the loop is safe to be vectorized.  It is, but not for hoisting.

Ick.  I hate this behind-the-back stuff - so safelen doesn't mean
that a[i] and a[0] do not alias.  Note that this will break with
SLP stuff at least as that will re-order reads/writes.  Not sure
how safelen applies to SLP though.  That is

    a[i] = i;
    b[i] = a[0];
    a[i+1] = i+1;
    b[i+1] = a[1];

will eventually end up re-ordering reads/writes in non-obvious
ways.

> So, the above say with emulated SIMD is safe to be executed as:
>   for (i = 0; i < 1024; i += 8)
>     {
>       int tmp;
>       for (tmp = i; tmp < i + 8; tmp++)
>       a[tmp] = tmp;
>       for (tmp = i; tmp < i + 8; tmp++)
>       b[tmp] = a[0];
>     }
> but not as:
>   int tempa[8], tmp;
>   /* Hoisted HW or emulated load + splat.  */
>   for (tmp = 0; tmp < 8; tmp++)
>     tempa[tmp] = a[0];
>   for (i = 0; i < 1024; i += 8)
>     {
>       for (tmp = i; tmp < i + 8; tmp++)
>       a[tmp] = tmp;
>       for (tmp = i; tmp < i + 8; tmp++)
>       b[tmp] = tempa[tmp];
>     }
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Reply via email to