Hi Richard You mentioned that Micha has a patch pending that enables of zero-step stores. What is the status of this patch? I could not find it through searching "Micha".
Thank you! Cong On Wed, Oct 16, 2013 at 2:02 AM, Richard Biener <rguent...@suse.de> wrote: > On Tue, 15 Oct 2013, Cong Hou wrote: > >> Thank you for your reminder, Jeff! I just noticed Richard's comment. I >> have modified the patch according to that. >> >> The new patch is attached. > > (posting patches inline is easier for review, now you have to deal > with no quoting markers ;)) > > Comments inline. > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 8a38316..2637309 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,8 @@ > +2013-10-15 Cong Hou <co...@google.com> > + > + * tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant > + statement that contains data refs with zero-step. > + > 2013-10-14 David Malcolm <dmalc...@redhat.com> > > * dumpfile.h (gcc::dump_manager): New class, to hold state > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 075d071..9d0f4a5 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,7 @@ > +2013-10-15 Cong Hou <co...@google.com> > + > + * gcc.dg/vect/pr58508.c: New test. > + > 2013-10-14 Tobias Burnus <bur...@net-b.de> > > PR fortran/58658 > diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c > b/gcc/testsuite/gcc.dg/vect/pr58508.c > new file mode 100644 > index 0000000..cb22b50 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/pr58508.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */ > + > + > +/* The GCC vectorizer generates loop versioning for the following loop > + since there may exist aliasing between A and B. The predicate checks > + if A may alias with B across all iterations. Then for the loop in > + the true body, we can assert that *B is a loop invariant so that > + we can hoist the load of *B before the loop body. */ > + > +void foo (int* a, int* b) > +{ > + int i; > + for (i = 0; i < 100000; ++i) > + a[i] = *b + 1; > +} > + > + > +/* { dg-final { scan-tree-dump-times "hoist" 2 "vect" } } */ > +/* { dg-final { cleanup-tree-dump "vect" } } */ > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c > index 574446a..f4fdec2 100644 > --- a/gcc/tree-vect-loop-manip.c > +++ b/gcc/tree-vect-loop-manip.c > @@ -2477,6 +2477,92 @@ vect_loop_versioning (loop_vec_info loop_vinfo, > adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi)); > } > > > Note that applying this kind of transform at this point invalidates > some of the earlier analysis the vectorizer performed (namely the > def-kind which now effectively gets vect_external_def from > vect_internal_def). In this case it doesn't seem to cause any > issues (we re-compute the def-kind everytime we need it (how wasteful)). > > + /* Extract load and store statements on pointers with zero-stride > + accesses. */ > + if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)) > + { > + /* In the loop body, we iterate each statement to check if it is a load > + or store. Then we check the DR_STEP of the data reference. If > + DR_STEP is zero, then we will hoist the load statement to the loop > + preheader, and move the store statement to the loop exit. */ > > We don't move the store yet. Micha has a patch pending that enables > vectorization of zero-step stores. > > + for (gimple_stmt_iterator si = gsi_start_bb (loop->header); > + !gsi_end_p (si);) > > While technically ok now (vectorized loops contain a single basic block) > please use LOOP_VINFO_BBS () to get at the vector of basic-blcoks > and iterate over them like other code does. > > + { > + gimple stmt = gsi_stmt (si); > + stmt_vec_info stmt_info = vinfo_for_stmt (stmt); > + struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); > + > + if (dr && integer_zerop (DR_STEP (dr))) > + { > + if (DR_IS_READ (dr)) > + { > + if (dump_enabled_p ()) > + { > + dump_printf_loc > + (MSG_NOTE, vect_location, > + "hoist the statement to outside of the loop "); > > "hoisting out of the vectorized loop: " > > + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0); > + dump_printf (MSG_NOTE, "\n"); > + } > + > + gsi_remove (&si, false); > + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), > stmt); > > Note that this will result in a bogus VUSE on the stmt at this point which > will be only fixed because of implementation details of loop versioning. > Either get the correct VUSE from the loop header virtual PHI node > preheader edge (if there is none then the current VUSE is the correct one > to use) or clear it. > > + } > + /* TODO: We also consider vectorizing loops containing zero-step > + data refs as writes. For example: > + > + int a[N], *s; > + for (i = 0; i < N; i++) > + *s += a[i]; > + > + In this case the write to *s can be also moved after the > + loop. */ > > Note that you then invalidate even more vectorizer analysis - you > basically introduce a scalar reduction (you have to add a PHI node). > Which means that the transform has to happen elsewhere. > > As Jakub now tries with if-conversion this would also be a candidate > for applying the loop versioning before even starting the rest of the > vectorizer analysis code. > > That said, I'd remove the TODO at this point because it's clearly not > possible to implement just here ;) > > + continue; > + } > + else if (!dr) > + { > + bool hoist = true; > + for (size_t i = 0; i < gimple_num_ops (stmt); i++) > > You are checking all kinds of statements, including assignments > of which you are also checking the LHS ... restricting to > assignments you can start walking at i = 1. > > + { > + tree op = gimple_op (stmt, i); > + if (TREE_CODE (op) == INTEGER_CST > + || TREE_CODE (op) == REAL_CST) > > There are other constants as well - just check > > if (is_gimple_min_invariant (op)) > > + continue; > + if (TREE_CODE (op) == SSA_NAME) > + { > + gimple def = SSA_NAME_DEF_STMT (op); > + if (def == stmt > > with starting at op 1 you can drop this. > > + || gimple_nop_p (def) > + || !flow_bb_inside_loop_p (loop, gimple_bb (def))) > + continue; > + } > > Note that you fail to hoist if-converted code this way because > op1 of > > name_1 = a_2 < b_4 ? x_5 : y_6; > > is 'a_2 < b_4', a tree expression and not an SSA name (ugh). Maybe > we don't care (it's just a missed optimization), but if you are > restricting yourself to hoisting assignments without a data-ref > then you can walk over SSA uses on the stmt (instead of over gimple > ops) with > > FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_USE) > > which would automagically take care of that case. > > Can you add a testcase which involves invariant if-conversion? > Can you add a testcase with just an invariant store to make sure > we don't wreck it? Can you add a testcase with an invariant store > and load (the reduction case), too? > > + hoist = false; > + break; > + } > > For example you'll hoist all labels this way (no ops), as well as the > loop exit GIMPLE_COND in case it's operands were loop invariant, > breaking the CFG ... so please add && is_gimple_assign () to the > if (!dr) check ;) > > + if (hoist) > + { > + gsi_remove (&si, false); > + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), > stmt); > + > + if (dump_enabled_p ()) > + { > + dump_printf_loc > + (MSG_NOTE, vect_location, > + "hoist the statement to outside of the loop "); > + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0); > + dump_printf (MSG_NOTE, "\n"); > + } > + continue; > + } > + } > + gsi_next (&si); > + } > + } > + > /* End loop-exit-fixes after versioning. */ > > if (cond_expr_stmt_list) > >> >> thanks, >> Cong >> >> >> On Tue, Oct 15, 2013 at 12:33 PM, Jeff Law <l...@redhat.com> wrote: >> > On 10/14/13 17:31, Cong Hou wrote: >> >> >> >> Any comment on this patch? >> > >> > Richi replied in the BZ you opened. >> > >> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58508 >> > >> > Essentially he said emit the load on the edge rather than in the block >> > itself. >> > jeff >> >