Re: [PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p.

Aldy Hernandez via Gcc-patches Wed, 16 Nov 2022 09:38:26 -0800



On 11/16/22 17:04, Richard Biener wrote:

On Tue, Nov 15, 2022 at 11:46 AM Aldy Hernandez <al...@redhat.com> wrote:




On 11/15/22 08:15, Richard Biener wrote:

On Mon, Nov 14, 2022 at 8:05 PM Aldy Hernandez <al...@redhat.com> wrote:




On 11/14/22 10:12, Richard Biener wrote:

On Sat, Nov 12, 2022 at 7:30 PM Aldy Hernandez <al...@redhat.com> wrote:


It irks me that a PR named "we should track ranges for floating-point
hasn't been closed in this release.  This is an attempt to do just
that.

As mentioned in the PR, even though we track ranges for floats, it has
been suggested that avoiding recursing through SSA defs in
gimple_assign_nonnegative_warnv_p is also a goal.  We can do this with
various ranger components without the need for a heavy handed approach
(i.e. a full ranger).

I have implemented two versions of known_float_sign_p() that answer
the question whether we definitely know the sign for an operation or a
tree expression.

Both versions use get_global_range_query, which is a wrapper to query
global ranges.  This means, that no caching or propagation is done.
In the case of an SSA, we just return the global range for it (think
SSA_NAME_RANGE_INFO).  In the case of a tree code with operands, we
also use get_global_range_query to resolve the operands, and then call
into range-ops, which is our lowest level component.  There is no
ranger or gori involved.  All we're doing is resolving the operation
with the ranges passed.

This is enough to avoid recursing in the case where we definitely know
the sign of a range.  Otherwise, we still recurse.

Note that instead of get_global_range_query(), we could use
get_range_query() which uses a ranger (if active in a pass), or
get_global_range_query if not.  This would allow passes that have an
active ranger (with enable_ranger) to use a full ranger.  These passes
are currently, VRP, loop unswitching, DOM, loop versioning, etc.  If
no ranger is active, get_range_query defaults to global ranges, so
there's no additional penalty.

Would this be acceptable, at least enough to close (or rename the PR ;-))?


I think the checks would belong to the gimple_stmt_nonnegative_warnv_p function
only (that's the SSA name entry from the fold-const.cc ones)?


That was my first approach, but I thought I'd cover the unary and binary
operators as well, since they had other callers.  But I'm happy with
just the top-level tweak.  It's a lot less code :).


@@ -9234,6 +9235,15 @@ bool
   gimple_stmt_nonnegative_warnv_p (gimple *stmt, bool *strict_overflow_p,
                                   int depth)
   {
+  tree type = gimple_range_type (stmt);
+  if (type && frange::supports_p (type))
+    {
+      frange r;
+      bool sign;
+      return (get_global_range_query ()->range_of_stmt (r, stmt)
+             && r.signbit_p (sign)
+             && sign == false);
+    }

the above means we never fall through to the switch below if
frange::supports_p (type) - that's eventually good enough, I
don't think we ever call this very function directly but it gets
invoked via recursion through operands only.  But of course


Woah, sorry.  That was not intended.  For that matter, the patch as
posted caused:

FAIL: gcc.dg/builtins-10.c (test for excess errors)
FAIL: gcc.dg/builtins-57.c (test for excess errors)
FAIL: gcc.dg/torture/builtin-nonneg-1.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-nonneg-1.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-nonneg-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/builtin-nonneg-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/builtin-nonneg-1.c   -Os  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c   -Os  (test for excess errors)


Did you investigate why?  Because the old patch removed the recursion
while the new keeps it in case the global range isn't present which isn't
as nice.

For gcc.dg/builtins-10.c, there are a few calls togimple_stmt_nonnegative* that are being made before we have globalranges (ccp1 and forwprop1), so the query returns VARYING (i.e. no knownsign). If you're curious, the call to gimple_stmt_nonnegative* comesvia courtesy of match.pd:


/* Canonicalization of sequences of math builtins.  These rules represent
   IL simplifications but are not necessarily optimizations.

So ISTM, we still need to fall through if we're being called beforeglobal ranges are available.

After global ranges are available (evrp), we would avoid further lookupsif it weren't for an unrelated problem I found.

foperator_abs::fold_range() is trying to set a range of [+0.0, +INF],but this little snpipet in the frange normalization code adds a -0.0 tothe range:


  else if (!HONOR_SIGNED_ZEROS (m_type))
    {
      if (real_iszero (&m_max, 1))
        m_max.sign = 0;
      if (real_iszero (&m_min, 0))
        m_min.sign = 1;
    }

We end up with:

[frange] double [-0.0 (-0x0.0p+0),1.79769313486231570814527423731704356798070567525844996599e+308(0x0.fffffffffffff8p+1024)]

I must say this is beyond my paygrade :). Jakub, it was your suggestionto add the snippet above. Is this correct? Note that this test is for-ffast-math.

If I comment out the code above, the regressions are fixed, both with mycurrent patch or with the original one. But as I suggested, maybe wewant the second patch, because we may be called before global ranges areavailable.

IMHO, we could go with the second patch, and fix the ABS problemindependently.


Yay?  Nay?

Aldy

Note that ranger folding calls this function, though it won't run the
risk of endless recursion because range_of_stmt uses the LHS, and only
use global ranges to solve the LHS.

Also, frange::supports_p() does not support all floats:

    static bool supports_p (const_tree type)
    {
      // ?? Decimal floats can have multiple representations for the
      // same number.  Supporting them may be as simple as just
      // disabling them in singleton_p.  No clue.
      return SCALAR_FLOAT_TYPE_P (type) && !DECIMAL_FLOAT_TYPE_P (type);
    }


OK, _Complex types are obviously missing, so are vector ones.

Finally, my patch is more conservative than what the *nonnegative_warn*
friends do.  We only return true when we're sure about the sign bit and
it's FALSE.  As I mentioned elsewhere, tree_call_nonnegative_warn_p()
always returns true for:

      CASE_CFN_ACOS:
      CASE_CFN_ACOS_FN:
      CASE_CFN_ACOSH:
      CASE_CFN_ACOSH_FN:
      CASE_CFN_CABS:
      CASE_CFN_CABS_FN:
...
...
        /* Always true.  */
        return true;

This means that we'll return true for a NAN, but we're incorrectly
assuming the NAN will be +NAN.  In my proposed patch, we don't make such
assumptions.  We only return true if the range is non-negative,
including the NAN.


Yep, the usual issue whether nonnegative means copysign (1, x) produces
1 or whether !(x < 0) is true.

I wonder what types are not supported by frange and whether
the manual processing we fall through to does anything meaningful
for those?

I won't ask you to thoroughly answer this now but please put in
a comment reflecting the above before the switch stmt.

     switch (gimple_code (stmt))


Otherwise OK, in case you tree gets back to bootstrapping ;)


Updated patch that passes test.

OK?  And if so, can I close the PR?


Yes, I think we now track float ranges - improvements are of course
always possible.

Richard.

Thanks.
Aldy

Re: [PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p.

Reply via email to