On Wed, 30 Mar 2016, Jan Hubicka wrote: > Hi, > while looking into sudoku solving benchark, I noticed that we incorrectly > estimate loop to iterate 10 times just because the array it traverses is of > dimension 10. This of course is just upper bound and not realistic bound. > Realistically those loops iterates once most of time. > > It turns out this bug was introduced by me in > https://gcc.gnu.org/ml/gcc-patches/2013-01/msg00444.html > While I do not recall doing that patch, it seems like a thinko about reliable > (name of the variable) and realistic (name of the parameter it is passed to). > > Fixing this caused one testsuite fallout in predictive commoning testcase > because loop unswitching previously disabled itself having an estimated number > of iterations 1 (I am not sure if that is not supposed to be 0, with 1 > iteration unswithcing may pay back, little bit, I suppose it wants to test > number of stmt executions of the condtional to be at least 2 which depends on > whether the conditional is before or after the loop exits). This made me > notice > that some loop passes that want to give up on low trip count uses combination > of estimated number of iterations and max number of iterations while other > don't which is fixed here. The code combining both realistic and max counts > is same as i.e. in unroller and other passes already. > > I also wonder if predictive comming is a win for loops with very low > trip count, but that is for separate patch, too, anyway. > > Bootstrapped/regtested x86_64-linux, OK? > > Honza > > * tree-ssa-loop-niter.c (idx_infer_loop_bounds): We can't get realistic > estimates here. > * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Use also > max_loop_iterations_int. > * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise. > * tree-vect-loop.c (vect_analyze_loop_2): Likewise. > Index: tree-ssa-loop-niter.c > =================================================================== > --- tree-ssa-loop-niter.c (revision 234516) > +++ tree-ssa-loop-niter.c (working copy) > @@ -3115,7 +3115,6 @@ idx_infer_loop_bounds (tree base, tree * > tree low, high, type, next; > bool sign, upper = true, at_end = false; > struct loop *loop = data->loop; > - bool reliable = true; > > if (TREE_CODE (base) != ARRAY_REF) > return true; > @@ -3187,14 +3186,14 @@ idx_infer_loop_bounds (tree base, tree * > && tree_int_cst_compare (next, high) <= 0) > return true; > > - /* If access is not executed on every iteration, we must ensure that > overlow may > - not make the access valid later. */ > + /* If access is not executed on every iteration, we must ensure that > overlow > + may not make the access valid later. */ > if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt)) > && scev_probably_wraps_p (initial_condition_in_loop_num (ev, > loop->num), > step, data->stmt, loop, true)) > - reliable = false; > + upper = false; > > - record_nonwrapping_iv (loop, init, step, data->stmt, low, high, reliable, > upper); > + record_nonwrapping_iv (loop, init, step, data->stmt, low, high, false, > upper); > return true; > } > > Index: tree-ssa-loop-unswitch.c > =================================================================== > --- tree-ssa-loop-unswitch.c (revision 234516) > +++ tree-ssa-loop-unswitch.c (working copy) > @@ -223,6 +223,8 @@ tree_unswitch_single_loop (struct loop * > /* If the loop is not expected to iterate, there is no need > for unswitching. */ > iterations = estimated_loop_iterations_int (loop); > + if (iterations < 0) > + iterations = max_loop_iterations_int (loop);
You are only changing one place in this file. > if (iterations >= 0 && iterations <= 1) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > Index: tree-ssa-loop-ivopts.c > =================================================================== > --- tree-ssa-loop-ivopts.c (revision 234516) > +++ tree-ssa-loop-ivopts.c (working copy) > @@ -121,7 +121,11 @@ avg_loop_niter (struct loop *loop) > { > HOST_WIDE_INT niter = estimated_stmt_executions_int (loop); > if (niter == -1) > - return AVG_LOOP_NITER (loop); > + { > + niter = max_stmt_executions_int (loop); > + if (niter == -1 || niter > AVG_LOOP_NITER (loop)) > + return AVG_LOOP_NITER (loop); > + } > > return niter; > } > Index: tree-vect-loop.c > =================================================================== > --- tree-vect-loop.c (revision 234516) > +++ tree-vect-loop.c (working copy) > @@ -2063,6 +2063,9 @@ start_over: > > estimated_niter > = estimated_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); > + if (estimated_niter != -1) > + estimated_niter > + = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); > if (estimated_niter != -1 > && ((unsigned HOST_WIDE_INT) estimated_niter > <= MAX (th, (unsigned)min_profitable_estimate))) The vectorizer already checks this (albeit indirectly): HOST_WIDE_INT max_niter = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo)); if ((LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor)) || (max_niter != -1 && (unsigned HOST_WIDE_INT) max_niter < vectorization_factor)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, "not vectorized: iteration count smaller than " "vectorization factor.\n"); return false; } Richard. > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)