Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]

Aldy Hernandez Fri, 04 Oct 2019 09:15:46 -0700



On 10/4/19 12:02 PM, Jeff Law wrote:

On 10/4/19 9:49 AM, Aldy Hernandez wrote:



On 10/4/19 11:38 AM, Jeff Law wrote:

On 10/4/19 6:59 AM, Aldy Hernandez wrote:

When I did the value_range canonicalization work, I noticed that an
unsigned [1,MAX] and an ~[0,0] could be two different representations
for the same thing.  I didn't address the problem then because callers
to ranges_from_anti_range() would go into an infinite loop trying to
extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.

Now that we have one main caller (from the symbolic PLUS/MINUS
handling), it's a lot easier to contain.  Well, singleton_p also calls
it, but it's already handling nonzero specially, so it wouldn't be affected.



With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
good form to have one representation, giving us the ability to pick at
nonzero_p ranges with ease.

The code in extract_range_from_plus_minus_expr() continues to be a mess
(as it has always been), but at least it's contained, and with this
patch, it's slightly smaller.

Note, I'm avoiding adding a comment header for functions with highly
descriptive obvious names.

OK?

Aldy

canonicalize-nonzero-ranges.patch

commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
Author: Aldy Hernandez <al...@redhat.com>
Date:   Fri Oct 4 08:51:25 2019 +0200

      Canonicalize UNSIGNED [1,MAX] into ~[0,0].
           Adapt PLUS/MINUS symbolic handling, so it doesn't call
      ranges_from_anti_range with a VR_ANTI_RANGE containing one
sub-range.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6e4f145af46..3934b41fdf9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2019-10-04  Aldy Hernandez  <al...@redhat.com>
+
+    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
+    instead of calling vrp_val_is_*.
+    (value_range_base::set): Canonicalize unsigned [1,MAX] into
+    non-zero.
+    (range_has_numeric_bounds_p): New.
+    (range_int_cst_p): Use range_has_numeric_bounds_p.
+    (ranges_from_anti_range): Assert that we won't recurse
+    indefinitely.
+    (extract_extremes_from_range): New.
+    (extract_range_from_plus_minus_expr): Adapt so we don't call
+    ranges_from_anti_range with an anti-range containing only one
+    sub-range.

So no problem with the implementation, but I do have a higher level
question.

One of the goals of the representation side of the Ranger project is to
drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
in the opposite direction.   So do we really want to canonicalize to
~[0,0]?


Hmmm, Andrew had the same question.

It really doesn't matter what we canonicalize too, as long as we're
consistent, but there are a bunch of non-zero tests throughout that were
checking for the ~[0,0] construct, and I didn't want to rock the boat
too much.  Although in all honesty, most of those should already be
converted to the ::nonzero_p() API.

However, if we canonicalize into [1,MAX] for unsigned, we have the
problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
code will have to check two different representations, not to mention it
will now have to check TYPE_UNSIGNED(type).

ISTM that the right thing to do is to first move to the ::nonzero_p API,
which should be a behavior preserving change.  It'd probably be a medium
sized change, but highly mechanical and behavior preserving, so easy to
review.

Ughh, please no? This was just a means to get the general range_fold*cleanups which are important and make everything easier to read. I'drather not get distracted by having to audit all the nonzero checkingthroughout.

Besides, we can't get away from anti-ranges inasmuch as we havevalue_range_base, which hopefully we can move away from in a futurerelease. So this will eventually become a non-issue. At that point,we'll loose ALL anti-ranges once and for all.

~[0,0] has been the accepted way for a long time, I'd really prefer tokeep that (for now).


And really, this is just plain ugly:

bool
value_range_base::nonzero_p ()
{
  if (m_kind == VR_ANTI_RANGE
      && !TYPE_UNSIGNED (type ())
      && integer_zerop (m_min)
      && integer_zerop (m_max))
    return true;

  return (m_kind == VR_RANGE
          && TYPE_UNSIGNED (type ())
          && integer_onep (m_min)
          && vrp_val_is_max (m_max));
}

Aldy


Once that's in place, then I'd seriously look at [1,MAX] as the
canonical form for unsigned types and [MIN, -1] [1, MAX] as the
canonical form for signed types.  If we're trying to get away from ANTI
ranges, canonicalizing on ~[0,0] just seems wrong.

Where things may get painful is things like [MIN, -3] [3, MAX] which
occur due to arithmetic and pointer alignments.  I think we have a hack
somewhere which special cases this into [MIN, -1], [1, MAX] even though
it's technically less precise.

jeff

Re: [patch] canonicalize unsigned [1,MAX] ranges into ~[0,0]

Reply via email to