On Wed, 30 Mar 2016, Jan Hubicka wrote:

> Hi,
> while looking into sudoku solving benchark, I noticed that we incorrectly
> estimate loop to iterate 10 times just because the array it traverses is of
> dimension 10. This of course is just upper bound and not realistic bound.
> Realistically those loops iterates once most of time.
> 
> It turns out this bug was introduced by me in
> https://gcc.gnu.org/ml/gcc-patches/2013-01/msg00444.html
> While I do not recall doing that patch, it seems like a thinko about reliable
> (name of the variable) and realistic (name of the parameter it is passed to).
> 
> Fixing this caused one testsuite fallout in predictive commoning testcase
> because loop unswitching previously disabled itself having an estimated number
> of iterations 1 (I am not sure if that is not supposed to be 0, with 1
> iteration unswithcing may pay back, little bit, I suppose it wants to test
> number of stmt executions of the condtional to be at least 2 which depends on
> whether the conditional is before or after the loop exits). This made me 
> notice
> that some loop passes that want to give up on low trip count uses combination
> of estimated number of iterations and max number of iterations while other
> don't which is fixed here. The code combining both realistic and max counts
> is same as i.e. in unroller and other passes already.
> 
> I also wonder if predictive comming is a win for loops with very low
> trip count, but that is for separate patch, too, anyway.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
>       * tree-ssa-loop-niter.c (idx_infer_loop_bounds): We can't get realistic
>       estimates here.
>       * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Use also
>       max_loop_iterations_int.
>       * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise.
>       * tree-vect-loop.c (vect_analyze_loop_2): Likewise.
> Index: tree-ssa-loop-niter.c
> ===================================================================
> --- tree-ssa-loop-niter.c     (revision 234516)
> +++ tree-ssa-loop-niter.c     (working copy)
> @@ -3115,7 +3115,6 @@ idx_infer_loop_bounds (tree base, tree *
>    tree low, high, type, next;
>    bool sign, upper = true, at_end = false;
>    struct loop *loop = data->loop;
> -  bool reliable = true;
>  
>    if (TREE_CODE (base) != ARRAY_REF)
>      return true;
> @@ -3187,14 +3186,14 @@ idx_infer_loop_bounds (tree base, tree *
>        && tree_int_cst_compare (next, high) <= 0)
>      return true;
>  
> -  /* If access is not executed on every iteration, we must ensure that 
> overlow may
> -     not make the access valid later.  */
> +  /* If access is not executed on every iteration, we must ensure that 
> overlow
> +     may not make the access valid later.  */
>    if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
>        && scev_probably_wraps_p (initial_condition_in_loop_num (ev, 
> loop->num),
>                               step, data->stmt, loop, true))
> -    reliable = false;
> +    upper = false;
>  
> -  record_nonwrapping_iv (loop, init, step, data->stmt, low, high, reliable, 
> upper);
> +  record_nonwrapping_iv (loop, init, step, data->stmt, low, high, false, 
> upper);
>    return true;
>  }
>  
> Index: tree-ssa-loop-unswitch.c
> ===================================================================
> --- tree-ssa-loop-unswitch.c  (revision 234516)
> +++ tree-ssa-loop-unswitch.c  (working copy)
> @@ -223,6 +223,8 @@ tree_unswitch_single_loop (struct loop *
>        /* If the loop is not expected to iterate, there is no need
>        for unswitching.  */
>        iterations = estimated_loop_iterations_int (loop);
> +      if (iterations < 0)
> +        iterations = max_loop_iterations_int (loop);

You are only changing one place in this file.

>        if (iterations >= 0 && iterations <= 1)
>       {
>         if (dump_file && (dump_flags & TDF_DETAILS))
> Index: tree-ssa-loop-ivopts.c
> ===================================================================
> --- tree-ssa-loop-ivopts.c    (revision 234516)
> +++ tree-ssa-loop-ivopts.c    (working copy)
> @@ -121,7 +121,11 @@ avg_loop_niter (struct loop *loop)
>  {
>    HOST_WIDE_INT niter = estimated_stmt_executions_int (loop);
>    if (niter == -1)
> -    return AVG_LOOP_NITER (loop);
> +    {
> +      niter = max_stmt_executions_int (loop);
> +      if (niter == -1 || niter > AVG_LOOP_NITER (loop))
> +        return AVG_LOOP_NITER (loop);
> +    }
>  
>    return niter;
>  }
> Index: tree-vect-loop.c
> ===================================================================
> --- tree-vect-loop.c  (revision 234516)
> +++ tree-vect-loop.c  (working copy)
> @@ -2063,6 +2063,9 @@ start_over:
>  
>    estimated_niter
>      = estimated_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo));
> +  if (estimated_niter != -1)
> +    estimated_niter
> +      = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo));
>    if (estimated_niter != -1
>        && ((unsigned HOST_WIDE_INT) estimated_niter
>            <= MAX (th, (unsigned)min_profitable_estimate)))

The vectorizer already checks this (albeit indirectly):

  HOST_WIDE_INT max_niter
    = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo));
  if ((LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
       && (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor))
      || (max_niter != -1
          && (unsigned HOST_WIDE_INT) max_niter < vectorization_factor))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "not vectorized: iteration count smaller than "
                         "vectorization factor.\n");
      return false;
    }

Richard.

> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reply via email to