On Fri, Jul 2, 2021 at 3:37 AM Bin.Cheng <amker.ch...@gmail.com> wrote:
>
> On Thu, Jul 1, 2021 at 8:19 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Mon, Jun 7, 2021 at 4:35 PM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> > >
> > > On Sun, Jun 6, 2021 at 12:01 PM Bin.Cheng <amker.ch...@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 2, 2021 at 3:28 PM Richard Biener via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > On Tue, Jun 1, 2021 at 4:00 PM bin.cheng via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > > As described in patch summary, this fixes the wrong code issue by 
> > > > > > adding overflow-ness
> > > > > > check for iv1.step - iv2.step.
> > > > > >
> > > > > > Bootstrap and test on x86_64.  Any comments?
> > > > >
> > > > > +         bool wrap_p = TYPE_OVERFLOW_WRAPS (step_type);
> > > > > +         if (wrap_p)
> > > > > +           {
> > > > > +             tree t = fold_binary_to_constant (GE_EXPR, step_type,
> > > > > +                                               iv0->step, iv1->step);
> > > > > +             wrap_p = integer_zerop (t);
> > > > > +           }
> > > > >
> > > > > I think we can't use TYPE_OVERFLOW_WRAPS/TYPE_OVERFLOW_UNDEFINED since
> > > > > that's only relevant for expressions written by the user - we're
> > > > > computing iv0.step - iv1.step
> > > > > which can even overflow when TYPE_OVERFLOW_UNDEFINED (in fact we may 
> > > > > not
> > > > > even generate this expression then!).  So I think we have to do sth 
> > > > > like
> > > > >
> > > > >    /* If the iv0->step - iv1->step wraps, fail.  */
> > > > >    if (!operand_equal_p (iv0->step, iv1->step)
> > > > >        && (TREE_CODE (iv0->step) != INTEGER_CST || TREE_CODE
> > > > > (iv1->step) != INTEGER_CST)
> > > > >        && !wi::gt (wi::to_widest (iv0->step), wi::to_widest 
> > > > > (iv1->step))
> > > > >      return false;
> > > > >
> > > > > which only handles equality and all integer constant steps. You could
> > > > Thanks for the suggestion.  I realized that we have LE/LT/NE
> > > > conditions here, and for LE/LT what we need to check is iv0/iv1
> > > > converge to each other, rather than diverge.  Also steps here can only
> > > > be constants, so there is no need to use range information.
> > >
> > > Ah, that simplifies things.
> > >
> > > +         if (tree_int_cst_lt (iv0->step, iv1->step))
> > > +           return false;
> > >
> > > so it looks to me that iv?->step can be negative which means we should
> > > verify that abs(iv0->step - iv1->step) <= abs (iv0->step), correct?
> > >
> > >        tree step = fold_binary_to_constant (MINUS_EXPR, step_type,
> > >                                            iv0->step, iv1->step);
> > > ...
> > > +         if (TREE_CODE (step) != INTEGER_CST)
> > > +           return false;
> > >
> > > note fold_binary_to_constant will return NULL if the result is not
> > > TREE_CONSTANT (which would also include symbolic constants
> > > like &a - &b).  It wasn't checked before, of course but since we're
> > > touching the code we might very well be checking for NULL step ...
> > > (or assert it is not for documentation purposes).
> > >
> > > That said, if iv0->step and iv1->step are known INTEGER_CSTs
> > > (I think they indeed are given the constraints we impose on
> > > simple_iv_with_niters).
> > >
> > > That said, with just a quick look again it looks to me the
> > > IV1 {<=,<} IV2 transform to IV1 - IV2step {<=,<} IV2base
> > > is OK whenever the effective step magnitude on the IV1'
> > > decreases, thus abs(IV1.step - IV2.step) <= abs(IV1.step)
> > > since then IV1 is still guaranteed to not overflow.  But
> > > for example {0, +, 1} and {10, -, 1} also converge if the
> > > number of iterations is less than 10 but they would not pass
> > > this criteria.  So I'm not sure "convergence" is a good wording
> > > here - but maybe I'm missing some critical piece of understanding
> > > here.
> > >
> > > But in any case it looks like we're on the right track ;)
> >
> > Just to pick up things where we left them (and seeing the patch to
> > unti-wrap which reminded me), I've digged in myself a bit and
> > came to the following conclusion.
> >
> > The b0 + s0 < b1 + s1 -> b0 + s0 - s1 < b1 transform is only
> > valid if b0 + s0 - s1 does not wrap which we can only ensure
> > by ensuring that b0 + s0 and b1 + s1 do not wrap (which we
> > already check) but also - what we're missing - that (s0 - s1)
> > makes b0 still evolve in the same direction (thus s0-s1 has the
> > same sign as s0) and that its magnitude is less that that of s0.
> >
> > Extra cases could be handled if we have an upper bound for
> > the number of iterations from other sources, not sure if that's
> > worth checking.
> >
> > Thus I am testing the attached now.
> >
> > Hope you don't mind - and I of course welcome any comments.
> Oh, not at all.  Your help on these issues are greatly appreciated.
>
> As for the change:
>
> > +         if (tree_int_cst_sign_bit (iv0->step) != tree_int_cst_sign_bit 
> > (step)
> > +             || wi::geu_p (wi::abs (wi::to_wide (step)),
> > +                           wi::abs (wi::to_wide (iv0->step))))
> It returns false on condition "{base, 5}_iv0 < {base, -1}_iv1", but
> this is like the "convergence" case I mentioned and could be analyzed?

Yes, I think so.  I just didn't manage to convince myself that a simple
step0 < step1 check is enough to ensure convergence.  So I settled
on the argument that the general
  x + CST1 < y + CST2   ->   X + CST1 - CST2 < y
transform is valid if the original adds did not wrap and the new
adds do not either.  Your step0 < step1 check would say converging
for {base, -1}_iv0 < {base, 1}_iv1 and that looked wrong - OTOH then
maybe we'd have computed iv0->no_overflow or iv1->no_overflow
to false since if the IVs do not converge there must be overflow for
the condition to switch from false to true (or vice versa)?  But then
the check would be not needed at all.

Hmm, so maybe our testcase analysis is wrong given the exit
we compute the wrong number of iterations is never taken, and when
it would be take the IVs would have wrapped but we likely computed
->no_overflow based on the _other_ exit which puts an upper bound
on the number of iterations ...?

So maybe adjust_cond_for_loop_until_wrap which is what in the
end adjusts things for the computation of the bogus niters assumes
the exit analyzed is eventually taken and that's a wrong assumption?

Richard.

> Thanks,
> bin
> > +           return false;
> > +       }
>
>
> Thanks,
> bin
> >
> > Thanks,
> > Richard.
> >
> > > Thanks,
> > > Richard.
> > >
> > > > > also use ranges
> > > > > like
> > > > >
> > > > >  wide_int min0, max0, min1, max1;
> > > > >   if (!operand_equal_p (iv->step, iv1->step)
> > > > >       && (determine_value_range (iv0->step, &min0, &max0) != VR_RANGE
> > > > >              || determine_value_range (iv1->step, &min1, &max1) != 
> > > > > VR_RANGE
> > > > >              || !wi::ge (min0, max1)))
> > > > >    return false;
> > > > >
> > > > > Note I'm not sure why
> > > > >
> > > > >        iv0->step = step;
> > > > >        if (!POINTER_TYPE_P (type))
> > > > >         iv0->no_overflow = false;
> > > > I don't exactly remember, this was added sometime when no_overflow was
> > > > introduced.  Note we only do various checks for non NE_EXPR so the
> > > > step isn't always less in absolute value?  I will check if we should
> > > > reset it in all cases.
> > > >
> > > > Patch updated.  test ongoing.
> > > >
> > > > Thanks,
> > > > bin
> > > > >
> > > > > here the no_overflow reset does not happen for pointer types?  Or
> > > > > rather why does
> > > > > it happen at all?  Don't we strictly make the step less in absolute 
> > > > > value?
> > > > >
> > > > > > Thanks,
> > > > > > bin

Reply via email to