On Mon, 17 Jan 2022, Jiufu Guo wrote:

> Richard Biener <rguent...@suse.de> writes:
> 
> > On Fri, 14 Jan 2022, Jiufu Guo wrote:
> >
> >> Richard Biener <rguent...@suse.de> writes:
> >> 
> >> > On Thu, 13 Jan 2022, guojiufu wrote:
> >> >
> >> >> On 2022-01-03 22:30, Richard Biener wrote:
> >> >> > On Wed, 22 Dec 2021, Jiufu Guo wrote:
> >> >> > 
> >> >> >> Hi,
> >> >> >> ...
> >> >> >> 
> >> >> >> Bootstrap and regtest pass on ppc64* and x86_64.  Is this ok for 
> >> >> >> trunk?
> >> >> > 
>  may means infer_loop_bounds_from_undefined may not useful
> before IV's scev is ready.

I think it's also SCEV does not get optimal results without
niter estimates (or rather filled control IVs).

> > override the global caching of analyze_scalar_evolution the per
> > SSA name cache for SCEV analysis isn't overridden.  That also means
> > computing the estimates early will break your patch (I've
> > tested calling estimate_numbers_of_iterations explicitely from
> > loop distribution for example).
> Calling simple_iv_with_niters/simple_iv early may break the patch,
> because only inside number_of_iterations_exit_assumptions, the flag
> is disabled.
> 
> >
> > What I'm trying to see is whether we can do some more concious
> > setup of control IV computation and estimate computation.  While
> > your patch produces the desired result it is far from obvious
> > why exactly it does this since it also does it in a "midlevel"
> > place (of course my attempts to do it in a more obvious position
> > failed).
> >
> > But first of all yes, we should disallow / early out on recursions
> > via public APIs like estimate_numbers_of_iterations (already done)
> > or number_of_latch_executions (missing) or 
> > number_of_iterations_exit[_assumptions] (very difficult there).
> 
> I'm wondering if we can set loop->nb_iterations=chrec_dont_know
> or chrec_known in number_of_iterations_exit_assumption at the
> begining, and use it as a guard to avoid recursions on them.

Yes, I tried that but in number_of_latch_executions which is the
one eventually setting loop->nb_iterations.  It's more difficult
to avoid recursing in number_of_iterations_exit, but in theory
we could add a niter specific edge flag like EDGE_NITER_IN_PROGRESS
we could set.

> >
> > IMHO that we lazily compute estimate_numbers_of_iterations and that
> > computes niter for each exit of a loop is a misdesign - the estimate
> > should be computed from the toplevel, and maybe independently for
> > each exit?  I think that Honza changed it this way to make sure the
> > estimates are always available when queried but I may mis-remember.
> >
> > With your patch, ontop of that limiting recursion of
> > number_of_latch_executions doesn't break the positive effect
> > at least.
> >
> > One unwanted side-effect of your patch might be that the
> > computed estimate is now recorded w/o infer_loop_bounds_from_undefined
> > which means it could be worse (and we won't re-compute it).
> Yes, estimate was computed but infer_loop_bounds_from_undefined was
> not called, and it will never be called before estimate is reset.
> 
> >
> > All in all it is somewhat of a mess and I'm not convinced your
> > patch is an improvement in this regard :/  It looks like there's
> > a chicken and egg problem with using and recording (at least one)
> > control IV and SCEV caching whilst searching for one.
> 
> Thanks for your comments, and welcome any sugguestions!

To address the missed optimization in the testcase I think we need
a way to roll back parts of SCEV caching so that
estimate_numbers_of_iterations can wipe all caching it caused
(and only that) after it is done.  To then get the maximal positive
effect we would also need to ensure that whenever we "visit" a loop
with SCEV we "first" do estimate_numbers_of_iterations (that's the
more difficult part I think, but maybe the less important one).

The first part involves the 'scalar_evolution_info' cache where
a solution would for example involve chaining the entries with
a 'next' pointer and keeping track of the last added entry so 
estimate_numbers_of_iterations can remember the last added entry
at start and remove those added.  So maybe add scev_{push,pop}_htab
functions (not sure if 'push' is a good name, the present entries
will still be used just after 'pop' all entries added between the
last push and the pop will be removed).

Doing that in estimate_numbers_of_iterations will cause some
extra work - maybe the function can pop() already after all
control IVs are recorded, so the infer_loop_bounds_from_undefined
work isn't wasted.

All of this is IMHO stage1 stuff now but would be really useful
to have.  Which is why I'm adding a note to my very large
RIRO (random-in-random-out ;)) TODO list - feel free to beat me
on it!

Thanks,
Richard.

> BR,
> Jiufu
> 
> >
> > Richard.
> >
> >
> >> >
> >> > I also tried
> >> >
> >> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> >> > index 935d2d4d8f3..d1787ba39b6 100644
> >> > --- a/gcc/tree-ssa-loop-ivopts.c
> >> > +++ b/gcc/tree-ssa-loop-ivopts.c
> >> > @@ -8093,6 +8093,8 @@ tree_ssa_iv_optimize_loop (struct ivopts_data 
> >> > *data, 
> >> > class loop *loop,
> >> >        fprintf (dump_file, "\n");
> >> >      }
> >> >  
> >> > +  estimate_numbers_of_iterations (loop);
> >> > +
> >> >    body = get_loop_body (loop);
> >> >    data->body_includes_call = loop_body_includes_call (body, 
> >> > loop->num_nodes);
> >> >    renumber_gimple_stmt_uids_in_blocks (body, loop->num_nodes);
> >> >
> >> > to get into the cycles from the "correct" entry but that does
> >> > not help either.  Nor does
> >> >
> >> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> >> > index b767056aeb0..f058be04836 100644
> >> > --- a/gcc/tree-ssa-loop-niter.c
> >> > +++ b/gcc/tree-ssa-loop-niter.c
> >> > @@ -2534,6 +2534,14 @@ number_of_iterations_exit_assumptions (class loop 
> >> > *loop, edge exit,
> >> >        && !POINTER_TYPE_P (type))
> >> >      return false;
> >> >  
> >> > +  if (loop->estimate_state == EST_NOT_COMPUTED)
> >> > +    {
> >> > +      bool saved = flag_aggressive_loop_optimizations;
> >> > +      flag_aggressive_loop_optimizations = false;
> >> > +      estimate_numbers_of_iterations (loop);
> >> > +      flag_aggressive_loop_optimizations = saved;
> >> > +    }
> >> > +
> >> >    tree iv0_niters = NULL_TREE;
> >> >    if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
> >> >                               op0, &iv0, safe ? &iv0_niters : NULL, 
> >> > false))
> >> >
> >> > or
> >> >
> >> > diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
> >> > index 61d72c278a1..d1af89eb459 100644
> >> > --- a/gcc/tree-scalar-evolution.c
> >> > +++ b/gcc/tree-scalar-evolution.c
> >> > @@ -518,7 +518,8 @@ set_scalar_evolution (basic_block 
> >> > instantiated_below, 
> >> > tree scalar, tree chrec)
> >> >         nb_set_scev++;
> >> >      }
> >> >  
> >> > -  *scalar_info = chrec;
> >> > +  if (*scalar_info == chrec_not_analyzed_yet)
> >> > +    *scalar_info = chrec;
> >> >  }
> >> >  
> >> >  /* Retrieve the chrec associated to SCALAR instantiated below
> >> >
> >> > That said, having the cycles is bad, we should definitively break
> >> > them (if only for compile-time reasons).  But I don't really
> >> > understand how your patch helps and my attempts do not which
> >> > means I am missing a critical piece of understanding ... :/
> >> 
> >> This patch disables flag_aggressive_loop_optimizations before
> >> analyze_scalar_evolution is called, then lv_10's scev is not
> >> computed during this call cycle.  lv_10's scev is calculated
> >> when it other passes (e.g. ivopt) request, at that time i_15
> >> has 'final' scev.
> >> 
> >> I had also tried to disable recursive from analyze_scalar_evolution
> >> on the same ssa name(i_15), and directly get the final result.  But
> >> I'm not finding a good way yet.
> >> 
> >> Again thanks for your suggestions!
> >> 
> >> Thanks!
> >> Jiufu
> >> 
> >> >
> >> > Thanks,
> >> > Richard.
> >> >
> >> >> Thanks again.
> >> >> 
> >> >> BR,
> >> >> Jiufu
> >> >> 
> >> >> > to SCEV and thus may recurse again - to me it would be more
> >> >> > logical to try avoid recursing in number_of_latch_executions by
> >> >> > setting ->nb_iterations to something early, maybe chrec_dont_know,
> >> >> > to signal we're using something we're just trying to compute.
> >> >> > 
> >> >> > Richard.
> >> >> > 
> >> >> >> BR,
> >> >> >> Jiufu
> >> >> >> 
> >> >> >> gcc/ChangeLog:
> >> >> >> 
> >> >> >>  * tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions):
> >> >> >>  Disable/restore flag_aggressive_loop_optimizations.
> >> >> >> 
> >> >> >> gcc/testsuite/ChangeLog:
> >> >> >> 
> >> >> >>  * gcc.dg/tree-ssa/scev-16.c: New test.
> >> >> >> 
> >> >> >> ---
> >> >> >>  gcc/tree-ssa-loop-niter.c               | 23 +++++++++++++++++++----
> >> >> >>  gcc/testsuite/gcc.dg/tree-ssa/scev-16.c | 20 ++++++++++++++++++++
> >> >> >>  2 files changed, 39 insertions(+), 4 deletions(-)
> >> >> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
> >> >> >> 
> >> >> >> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> >> >> >> index 06954e437f5..51bb501019e 100644
> >> >> >> --- a/gcc/tree-ssa-loop-niter.c
> >> >> >> +++ b/gcc/tree-ssa-loop-niter.c
> >> >> >> @@ -2534,18 +2534,31 @@ number_of_iterations_exit_assumptions (class 
> >> >> >> loop
> >> >> >> *loop, edge exit,
> >> >> >>        && !POINTER_TYPE_P (type))
> >> >> >>      return false;
> >> >> >> 
> >> >> >> +  /* Before niter is calculated, avoid to analyze interim state. */
> >> >> >> +  int old_aggressive_loop_optimizations =
> >> >> >> flag_aggressive_loop_optimizations;
> >> >> >> +  flag_aggressive_loop_optimizations = 0;
> >> >> >> +
> >> >> >>    tree iv0_niters = NULL_TREE;
> >> >> >>    if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
> >> >> >>                            op0, &iv0, safe ? &iv0_niters : NULL, 
> >> >> >> false))
> >> >> >> -    return number_of_iterations_popcount (loop, exit, code, niter);
> >> >> >> +    {
> >> >> >> +      bool res = number_of_iterations_popcount (loop, exit, code, 
> >> >> >> niter);
> >> >> >> +      flag_aggressive_loop_optimizations =
> >> >> >> old_aggressive_loop_optimizations;
> >> >> >> +      return res;
> >> >> >> +    }
> >> >> >>    tree iv1_niters = NULL_TREE;
> >> >> >>    if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
> >> >> >>                            op1, &iv1, safe ? &iv1_niters : NULL, 
> >> >> >> false))
> >> >> >> -    return false;
> >> >> >> +    {
> >> >> >> +      flag_aggressive_loop_optimizations =
> >> >> >> old_aggressive_loop_optimizations;
> >> >> >> +      return false;
> >> >> >> +    }
> >> >> >>    /* Give up on complicated case.  */
> >> >> >>    if (iv0_niters && iv1_niters)
> >> >> >> -    return false;
> >> >> >> -
> >> >> >> +    {
> >> >> >> +      flag_aggressive_loop_optimizations =
> >> >> >> old_aggressive_loop_optimizations;
> >> >> >> +      return false;
> >> >> >> +    }
> >> >> >>    /* We don't want to see undefined signed overflow warnings while
> >> >> >>       computing the number of iterations.  */
> >> >> >>    fold_defer_overflow_warnings ();
> >> >> >> @@ -2565,6 +2578,7 @@ number_of_iterations_exit_assumptions (class 
> >> >> >> loop
> >> >> >> *loop, edge exit,
> >> >> >>                                        only_exit_p, safe))
> >> >> >>      {
> >> >> >>        fold_undefer_and_ignore_overflow_warnings ();
> >> >> >> +      flag_aggressive_loop_optimizations =
> >> >> >> old_aggressive_loop_optimizations;
> >> >> >>        return false;
> >> >> >>      }
> >> >> >> 
> >> >> >> @@ -2608,6 +2622,7 @@ number_of_iterations_exit_assumptions (class 
> >> >> >> loop
> >> >> >> *loop, edge exit,
> >> >> >>              niter->may_be_zero);
> >> >> >> 
> >> >> >>    fold_undefer_and_ignore_overflow_warnings ();
> >> >> >> +  flag_aggressive_loop_optimizations = 
> >> >> >> old_aggressive_loop_optimizations;
> >> >> >> 
> >> >> >>    /* If NITER has simplified into a constant, update MAX.  */
> >> >> >>    if (TREE_CODE (niter->niter) == INTEGER_CST)
> >> >> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
> >> >> >> b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
> >> >> >> new file mode 100644
> >> >> >> index 00000000000..708ffab88ca
> >> >> >> --- /dev/null
> >> >> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
> >> >> >> @@ -0,0 +1,20 @@
> >> >> >> +/* { dg-do compile } */
> >> >> >> +/* { dg-options "-O2 -fdump-tree-ivopts-scev" } */
> >> >> >> +
> >> >> >> +/* Expect scalar_evolution = {(int) start_7(D), +, 1}_1), instead
> >> >> >> +   (int) (short int) {(unsigned short) start_7(D), +, 1}_1 */
> >> >> >> +
> >> >> >> +int arr[1000];
> >> >> >> +
> >> >> >> +void
> >> >> >> +s2i (short start, short end)
> >> >> >> +{
> >> >> >> +  int res = 0;
> >> >> >> +  for (short i = start; i < end; i++)
> >> >> >> +    {
> >> >> >> +      int lv = i;
> >> >> >> +      arr[lv] += lv;
> >> >> >> +    }
> >> >> >> +}
> >> >> >> +
> >> >> >> +/* { dg-final { scan-tree-dump-times "scalar_evolution = \{\\(int\\)
> >> >> >> start_\[0-9\]+\\(D\\), \\+, 1\}_1" 1 "ivopts" } } */
> >> >> >> 
> >> >> 
> >> >> 
> >> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Reply via email to