On 10/4/19 12:02 PM, Jeff Law wrote:
On 10/4/19 9:49 AM, Aldy Hernandez wrote:


On 10/4/19 11:38 AM, Jeff Law wrote:
On 10/4/19 6:59 AM, Aldy Hernandez wrote:
When I did the value_range canonicalization work, I noticed that an
unsigned [1,MAX] and an ~[0,0] could be two different representations
for the same thing.  I didn't address the problem then because callers
to ranges_from_anti_range() would go into an infinite loop trying to
extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.

Now that we have one main caller (from the symbolic PLUS/MINUS
handling), it's a lot easier to contain.  Well, singleton_p also calls
it, but it's already handling nonzero specially, so it wouldn't be affected.



With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
good form to have one representation, giving us the ability to pick at
nonzero_p ranges with ease.

The code in extract_range_from_plus_minus_expr() continues to be a mess
(as it has always been), but at least it's contained, and with this
patch, it's slightly smaller.

Note, I'm avoiding adding a comment header for functions with highly
descriptive obvious names.

OK?

Aldy

canonicalize-nonzero-ranges.patch

commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
Author: Aldy Hernandez <al...@redhat.com>
Date:   Fri Oct 4 08:51:25 2019 +0200

      Canonicalize UNSIGNED [1,MAX] into ~[0,0].
           Adapt PLUS/MINUS symbolic handling, so it doesn't call
      ranges_from_anti_range with a VR_ANTI_RANGE containing one
sub-range.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6e4f145af46..3934b41fdf9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2019-10-04  Aldy Hernandez  <al...@redhat.com>
+
+    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
+    instead of calling vrp_val_is_*.
+    (value_range_base::set): Canonicalize unsigned [1,MAX] into
+    non-zero.
+    (range_has_numeric_bounds_p): New.
+    (range_int_cst_p): Use range_has_numeric_bounds_p.
+    (ranges_from_anti_range): Assert that we won't recurse
+    indefinitely.
+    (extract_extremes_from_range): New.
+    (extract_range_from_plus_minus_expr): Adapt so we don't call
+    ranges_from_anti_range with an anti-range containing only one
+    sub-range.
So no problem with the implementation, but I do have a higher level
question.

One of the goals of the representation side of the Ranger project is to
drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
in the opposite direction.   So do we really want to canonicalize to
~[0,0]?

Hmmm, Andrew had the same question.

It really doesn't matter what we canonicalize too, as long as we're
consistent, but there are a bunch of non-zero tests throughout that were
checking for the ~[0,0] construct, and I didn't want to rock the boat
too much.  Although in all honesty, most of those should already be
converted to the ::nonzero_p() API.

However, if we canonicalize into [1,MAX] for unsigned, we have the
problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
code will have to check two different representations, not to mention it
will now have to check TYPE_UNSIGNED(type).
ISTM that the right thing to do is to first move to the ::nonzero_p API,
which should be a behavior preserving change.  It'd probably be a medium
sized change, but highly mechanical and behavior preserving, so easy to
review.

Ughh, please no? This was just a means to get the general range_fold* cleanups which are important and make everything easier to read. I'd rather not get distracted by having to audit all the nonzero checking throughout.

Besides, we can't get away from anti-ranges inasmuch as we have value_range_base, which hopefully we can move away from in a future release. So this will eventually become a non-issue. At that point, we'll loose ALL anti-ranges once and for all.

~[0,0] has been the accepted way for a long time, I'd really prefer to keep that (for now).

And really, this is just plain ugly:

bool
value_range_base::nonzero_p ()
{
  if (m_kind == VR_ANTI_RANGE
      && !TYPE_UNSIGNED (type ())
      && integer_zerop (m_min)
      && integer_zerop (m_max))
    return true;

  return (m_kind == VR_RANGE
          && TYPE_UNSIGNED (type ())
          && integer_onep (m_min)
          && vrp_val_is_max (m_max));
}

Aldy


Once that's in place, then I'd seriously look at [1,MAX] as the
canonical form for unsigned types and [MIN, -1] [1, MAX] as the
canonical form for signed types.  If we're trying to get away from ANTI
ranges, canonicalizing on ~[0,0] just seems wrong.

Where things may get painful is things like [MIN, -3] [3, MAX] which
occur due to arithmetic and pointer alignments.  I think we have a hack
somewhere which special cases this into [MIN, -1], [1, MAX] even though
it's technically less precise.

jeff

Reply via email to