[Bug tree-optimization/49760] vectorization inhibited if indices are references
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2011.07.18 08:57:31
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2011-07-18 08:57:31 UTC ---
The issue is that *k may be aliased by a store to the int array pointed to by out.
That is because the middle-end treats int & k the same as int * k (not sure if there is a semantic difference in C++), so it might as well point to out.a[27].
This means int & k should be restrict qualified as well (why even pass it by reference?)
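The aliasing concern in comment #1 can be illustrated with a minimal C++ sketch. The `Out` struct, its size, and all function names are hypothetical and invented for this illustration; the report does not show the real definitions. The sketch shows that a caller may legally bind the reference `k` to an element of the output array, in which case the store to `out.a[k]` changes `k` itself, so `k` cannot simply be kept in a register without further guarantees.

```cpp
#include <cassert>

// Hypothetical miniature of the output structure; the name and the size
// are assumptions made for this sketch, not taken from the report.
struct Out {
    int a[8];
};

// k is passed by reference, as in the report. The store to out.a[k] may
// write to k itself, so k has to be reloaded from memory every iteration.
void fill(Out& out, int& k, int n) {
    for (int i = 0; i != n; ++i) {
        assert(k >= 0 && k < 8);  // keep the sketch within bounds
        out.a[k] = 2;
        ++k;
    }
}

// The caller binds k to an element of the output array.
int final_k_when_aliased() {
    Out out{};
    fill(out, out.a[0], 3);
    return out.a[0];
}

// The caller keeps k in a separate variable.
int final_k_when_distinct() {
    Out out{};
    int k = 0;
    fill(out, k, 3);
    return k;
}
```

With `k` bound to `out.a[0]`, the three iterations leave `k` at 5 instead of 3, because the first store overwrites the index. Both calls are valid C++, which is why the loop has to be compiled so that it is correct for either one.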
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #2 from vincenzo Innocente vincenzo.innocente at cern dot ch 2011-07-18 09:39:49 UTC ---
Thanks for the detailed explanation.
In the real-life code, out is filled by calling foo multiple times (a sort of nested loop), and k is used to track its current size.
It is true that in this particular case I can just do k+=N; at the end of the loop.

The issue is more with
for (int i=0; i!=in.size; ++i)
as in foo2, because of
not vectorized: number of iterations cannot be computed
while copying in.size into a local N works.

b.t.w.
void foo(SoA const & __restrict__ in, SoB & __restrict__ out, int & __restrict__ k) {
  int N=in.size;
  for (int i=0; i!=N; ++i) {
    out.b[k] = in.c[i]+in.b[i];
    out.a[k] = in.a[i];
    ++k;
  }
}
does not vectorize either
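The two loop-bound variants mentioned in comment #2 can be sketched side by side. This is only an illustration under assumptions: the member types of `SoA` and `SoB` are guesses (the report does not show them), and the function names `copy_bound_in_memory`, `copy_bound_local`, and `bound_variants_agree` are hypothetical. The sketch makes no claim about which variant a particular GCC version vectorizes; it only shows the source-level difference and that both compute the same result when nothing aliases.

```cpp
#include <cassert>
#include <vector>

// Assumed layouts: the report does not show SoA/SoB, so these are guesses.
struct SoA { const int* a; const float* b; const float* c; int size; };
struct SoB { int* a; float* b; };

// Loop bound re-read from in.size on every iteration (the foo2 pattern).
void copy_bound_in_memory(SoA const& in, SoB& out) {
    for (int i = 0; i != in.size; ++i) {
        out.b[i] = in.c[i] + in.b[i];
        out.a[i] = in.a[i];
    }
}

// Loop bound copied into a local before the loop starts.
void copy_bound_local(SoA const& in, SoB& out) {
    const int N = in.size;
    for (int i = 0; i != N; ++i) {
        out.b[i] = in.c[i] + in.b[i];
        out.a[i] = in.a[i];
    }
}

// Runs both variants on the same non-aliasing input and compares outputs.
bool bound_variants_agree() {
    std::vector<int> a{1, 2, 3, 4};
    std::vector<float> b{0.5f, 1.5f, 2.5f, 3.5f};
    std::vector<float> c{1.0f, 2.0f, 3.0f, 4.0f};
    assert(a.size() == b.size() && b.size() == c.size());
    SoA in{a.data(), b.data(), c.data(), 4};

    std::vector<int> a1(4), a2(4);
    std::vector<float> b1(4), b2(4);
    SoB out1{a1.data(), b1.data()};
    SoB out2{a2.data(), b2.data()};

    copy_bound_in_memory(in, out1);
    copy_bound_local(in, out2);
    return a1 == a2 && b1 == b2 && a1 == a;
}
```

The local `N` is a value that no store through `out` can reach, so the trip count is fixed before the loop is entered.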
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #3 from rguenther at suse dot de rguenther at suse dot de 2011-07-18 09:44:23 UTC ---
On Mon, 18 Jul 2011, vincenzo.innocente at cern dot ch wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760
>
> --- Comment #2 from vincenzo Innocente vincenzo.innocente at cern dot ch 2011-07-18 09:39:49 UTC ---
> Thanks for the detailed explanation.
> In the real life code, out is filled calling foo multiple times (a sort of nested loop), k was used to track the current size of it.
> It is true that in this particular case I can just k+=N; at the end of the loop.
>
> The issue is more with
> for (int i=0; i!=in.size; ++i)
> as in foo2 because of
> not vectorized: number of iterations cannot be computed
> while copying it locally in N works.

Probably the same issue - the out array may point to in.size and thus clobber it (yes - only if the loop runs exactly once, but we don't use that information yet).

> b.t.w.
> void foo(SoA const & __restrict__ in, SoB & __restrict__ out, int & __restrict__ k) {
>   int N=in.size;
>   for (int i=0; i!=N; ++i) {
>     out.b[k] = in.c[i]+in.b[i];
>     out.a[k] = in.a[i];
>     ++k;
>   }
> }
> does not vectorize either

I suppose we can't vectorize the k iterator (or rather we don't apply store motion).
You should avoid using induction variables that live in memory.
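The advice to keep induction variables out of memory amounts to doing the store motion by hand: read `k` once, index with a local, and write `k` back after the loop (the "k+=N at the end" idea from comment #2). The following is a minimal sketch under assumptions, not the reporter's actual code: the `SoA`/`SoB` layouts are guesses, and `append` and `append_twice` are hypothetical names. It assumes `k` does not alias the output arrays, which is what the `__restrict__` qualifier in the report is meant to express. Whether a given compiler version then vectorizes the loop is not claimed here.

```cpp
#include <cassert>
#include <vector>

// Assumed layouts: the report does not show SoA/SoB, so these are guesses.
struct SoA { const int* a; const float* b; const float* c; int size; };
struct SoB { int* a; float* b; };

// Appends in to out starting at position k, then advances k.
// The only induction variable is the local i; k is read once before
// the loop and stored once after it.
void append(SoA const& in, SoB& out, int& k) {
    const int N = in.size;
    const int base = k;  // single load of the referenced index
    for (int i = 0; i != N; ++i) {
        out.b[base + i] = in.c[i] + in.b[i];
        out.a[base + i] = in.a[i];
    }
    k = base + N;        // single store after the loop
}

// Usage: fill out by calling append repeatedly, as described in the thread.
int append_twice() {
    std::vector<int> a{1, 2, 3};
    std::vector<float> b{0.5f, 1.5f, 2.5f};
    std::vector<float> c{1.0f, 1.0f, 1.0f};
    SoA in{a.data(), b.data(), c.data(), 3};

    std::vector<int> oa(6);
    std::vector<float> ob(6);
    SoB out{oa.data(), ob.data()};

    int k = 0;
    append(in, out, k);
    append(in, out, k);
    assert(oa[3] == 1 && ob[5] == 3.5f);  // second call wrote slots 3..5
    return k;
}
```

The caller still tracks the fill level through `k`, so the interface stays the same as in the report; only the loop body stops touching memory for its index.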
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49760

--- Comment #4 from vincenzo Innocente vincenzo.innocente at cern dot ch 2011-07-18 10:07:10 UTC ---
Fair enough.
I think I can persuade developers to use only local variables as induction variables.
It will be more difficult to get them to also copy all other variables stored in memory (see PR49773).