[Bug tree-optimization/49760] vectorization inhibited if indices are references

2011-07-18 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2011.07.18 08:57:31
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2011-07-18 
08:57:31 UTC ---
The issue is that *k may be aliased by a store to the int array pointed to
by out.  That is because the middle-end treats int & k the same as
int * k (not sure if there is a semantic difference in C++), so it might
as well point to out.a[27].  Which means int & k should be restrict
qualified as well (why even pass it as reference?)
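
To illustrate the aliasing hazard described above, here is a minimal sketch.
The Out struct, the fill function and the two run_* helpers are hypothetical
names made up for this example, not code from the bug report. It shows a legal
call in which the reference index lives inside the array being written, which
is why the compiler cannot assume that k is unchanged by the stores.

```cpp
// Hypothetical stand-in for the output type; only the int array matters here.
struct Out {
    int a[8];
};

// The index is passed by reference, so every store to out.a[...] might
// modify k. Without a restrict qualifier, k must be reloaded each iteration.
void fill(Out &out, int &k, int n) {
    for (int i = 0; i != n; ++i) {
        out.a[k] = 5;
        ++k;
    }
}

// The index aliases out.a[0]: the first store overwrites the index itself.
int run_aliased() {
    Out out = {};
    fill(out, out.a[0], 2);
    return out.a[0];  // 7: the index jumped to 5, then was incremented twice
}

// The index is an independent local: it simply counts the two iterations.
int run_separate() {
    Out out = {};
    int k = 0;
    fill(out, k, 2);
    return k;  // 2
}
```

Both calls are valid C++, so a compiler that kept k in a register for the
aliased case would produce a different result.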


[Bug tree-optimization/49760] vectorization inhibited if indices are references

2011-07-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #2 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-07-18 09:39:49 UTC ---
Thanks for the detailed explanation.
In the real-life code, out is filled by calling foo multiple times (a sort of
nested loop); k was used to track its current size.
It is true that in this particular case I can just do k+=N; at the end of the
loop.

The bigger issue is with
for (int i=0; i!=in.size; ++i) as in foo2,
which is rejected with
not vectorized: number of iterations cannot be computed
whereas copying in.size into a local N works.


b.t.w.
void foo(SoA const & __restrict__ in, SoB & __restrict__ out,
         int & __restrict__ k) {
  int N=in.size;
  for (int i=0; i!=N; ++i) {
    out.b[k] = in.c[i]+in.b[i];
    out.a[k] = in.a[i];
    ++k;
  }
}
does not vectorize either


[Bug tree-optimization/49760] vectorization inhibited if indices are references

2011-07-18 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #3 from rguenther at suse dot de rguenther at suse dot de 
2011-07-18 09:44:23 UTC ---
On Mon, 18 Jul 2011, vincenzo.innocente at cern dot ch wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760
>
> --- Comment #2 from vincenzo Innocente vincenzo.innocente at cern dot ch
> 2011-07-18 09:39:49 UTC ---
> Thanks for the detailed explanation.
> In the real-life code, out is filled by calling foo multiple times (a sort of
> nested loop); k was used to track its current size.
> It is true that in this particular case I can just do k+=N; at the end of the
> loop.
>
> The bigger issue is with
> for (int i=0; i!=in.size; ++i) as in foo2,
> which is rejected with
> not vectorized: number of iterations cannot be computed
> whereas copying in.size into a local N works.

Probably the same issue - the out array may point to in.size and
thus clobber it (yes - only if the loop runs exactly once, but we
don't use that information yet).
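
A minimal sketch of the two loop forms under discussion. The SoA and SoB
layouts here are assumptions (the thread does not show them), and the names
copy_reload and copy_local are hypothetical. Whether either form is actually
vectorized depends on the GCC version and flags; the sketch only shows where
the loop bound is read from.

```cpp
// Hypothetical layouts; the real SoA/SoB definitions are not in this thread.
struct SoA {
    int size;
    int *a;
};
struct SoB {
    int *a;
};

// The bound is re-read from memory on every iteration. A store through
// out.a could, as far as the compiler knows, overwrite in.size.
void copy_reload(SoA const &in, SoB &out) {
    for (int i = 0; i != in.size; ++i) {
        out.a[i] = in.a[i];
    }
}

// The bound is copied into a local once, so no store inside the loop can
// change it and the iteration count is known on entry.
void copy_local(SoA const &in, SoB &out) {
    const int n = in.size;
    for (int i = 0; i != n; ++i) {
        out.a[i] = in.a[i];
    }
}
```

For non-aliasing inputs the two functions compute the same result; they differ
only in what the optimizer is allowed to assume about the trip count.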

 
> b.t.w.
> void foo(SoA const & __restrict__ in, SoB & __restrict__ out,
>          int & __restrict__ k) {
>   int N=in.size;
>   for (int i=0; i!=N; ++i) {
>     out.b[k] = in.c[i]+in.b[i];
>     out.a[k] = in.a[i];
>     ++k;
>   }
> }
> does not vectorize either

I suppose we can't vectorize the k iterator (or rather we don't
apply store motion).  You should avoid using induction variables
that live in memory.
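
One way to follow that advice, sketched under assumptions: the SoA and SoB
layouts below are hypothetical (int arrays are assumed), as are the function
name foo_local_k and the local kk. The index is copied into a local before the
loop and written back once afterwards, so the induction variable no longer
lives in memory. This is not a guarantee that any particular GCC version
vectorizes the loop.

```cpp
// Hypothetical layouts; the real definitions are not shown in this thread.
struct SoA {
    int size;
    int *a;
    int *b;
    int *c;
};
struct SoB {
    int *a;
    int *b;
};

// __restrict__ on references is a GCC extension, as used in the thread.
void foo_local_k(SoA const &__restrict__ in, SoB &__restrict__ out, int &k) {
    const int n = in.size;  // loop bound kept in a local
    int kk = k;             // induction variable kept in a local
    for (int i = 0; i != n; ++i) {
        out.b[kk] = in.c[i] + in.b[i];
        out.a[kk] = in.a[i];
        ++kk;
    }
    k = kk;                 // single write-back after the loop
}
```

The caller still sees k advanced by in.size, which matches the "k+=N at the
end of the loop" approach mentioned in comment #2.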


[Bug tree-optimization/49760] vectorization inhibited if indices are references

2011-07-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #4 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-07-18 10:07:10 UTC ---
Fair enough.
I think I can persuade developers to use only local variables as induction
variables.
It will be harder to make them also copy all the other variables stored in
memory (see PR49773).