On 12/15/20 10:29 PM, Jeff Law wrote:


On 11/11/20 6:09 PM, Martin Sebor via Gcc-patches wrote:
The attached patch builds on top of the series I posted last
week(*) to improve the detection of out of bounds pointers
and C++ references, as well as a subset of invalid pointer
relational and subtraction expressions.

First, as I mentioned last week, the simple compute_objsize
cache I implemented then is space inefficient.  The changes
in this update enhance the  cache to reduce the space overhead
and improve compile-time efficiency.  The cache now consists
of two arrays, one storing the indices to the other.
The former is indexed by SSA_NAME version.  The latter also
contains separate entries for sizes of enclosing objects and
their members (missing from the first attempt, leading to
inefficient hacks to overcome the limitation).   These
improvements let clients look up the provenance of any pointer
in O(1) time and avoid the hacks.

Second, with the necessary cache improvements above,
the gimple-array-bounds changes enhance array bounds checking
in two ways:
1) pointers passed to functions or used to initialize C++
references are checked to see if they're valid and in bounds
and diagnosed if not (a subset of instances of passing valid
just-past-past-the-end and so non-dereferenceable pointers to
functions are also diagnosed)
2) relational or difference expressions involving pointers are
checked to make sure they point to the same object and diagnosed
if not.

Besides bootstrapping and regtesting I have also tested the full
patch series with a few packages, including Binutils/GDB, Glibc
and Valgrind, and verified that it doesn't cause any false
positives.  The new -Wpointer-compare warning does trigger in
two or three places in each, for subtracting pointers to distinct
objects.  Those are true positives, but the code I checked looks
benign.  In such cases the warning can be suppressed by converting
the pointers to intptr_t before the subtraction.

Martin

[*] Prerequisite patches:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558127.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557807.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557987.html

gcc-wpointer-compare.diff

Improve detection of invalid pointer operations.

gcc/ChangeLog:

        PR middle-end/93848
        * builtins.c (compute_objsize_r): Add argument.
        (access_ref::parm): New function.
        (access_ref::get_ref): Add argument.  Avoid null pointers.  Treat
        multiple PHIs with all arguments referencing the same object the same
        as references to that object with a range of offsets.  Clear BASE0
        for PHIs with arguments referencing distinct objects and different
        offsets.
        (access_ref::get_ref_type): New function.
        (access_ref::size_remaining): Make sure low bound of size is
        nonnegative.
        (pointer_query::pointer_query): Adjust ctor.
        (pointer_query::get_ref): Handle separate index and var caches.
        (pointer_query::put_ref): Same.
        (pointer_query::flush_cache): New function.
        (access_ref::inform_access): Differentiate pointers to allocated
        objects from those returned by non-allocation functions.
        (get_offset_range): Make extern.
        (handle_min_max_size): Add argument.  Pass it to compute_objsize.
        (compute_objsize_r, compute_objsize): Same.
        (gimple_call_alloc_p): Make extern.
        (maybe_emit_free_warning): Adjust and simplify test for built-in
        functions.
        * builtins.h (struct access_ref): Add new members, adjust existing.
        (get_offset_range): Declare.
        (gimple_call_alloc_p): Same.
        (compute_objsize): Add argument.
        * common.opt (-Wpointer-compare): Move option here from gcc/c.opt.
        * doc/invoke.texi (-Wpointer-compare): Update.
        * gimple-array-bounds.cc (ptrrefs): New variable.
        (format_offset_range): New function.
        (format_subscript_range): New function.
        (array_bounds_checker::check_array_ref): Remove argument.
        (array_bounds_checker::check_mem_ref): Remove argument.  Check
        no-warning bit on the current statement.  Use format_subscript_range.
        Simplify informational messages.
        (array_bounds_checker::check_addr_expr): Remove redundant argument
        from calls.
        (array_bounds_checker::ptrs_to_distinct): New function.
        (array_bounds_checker::check_pointer_op): Same.
        (array_bounds_checker::handle_assign): Same.
        (array_bounds_checker::handle_call): Same.
        (array_bounds_checker::check_array_bounds): Set new
        array_bounds_checker members.  Adjust calls.
        (check_array_bounds_dom_walker::before_dom_children): Call new
        array_bounds_checker members.  Set statement visited bit.
        (array_bounds_checker::check): Create and a pointer_query instance
        and flush its cache.
        * gimple-array-bounds.h (class array_bounds_checker): Add new members.
        * tree-ssa-strlen.c (maybe_warn_overflow): Pass a new argument to
        compute_objsize.
        (strlen_dom_walker::strlen_dom_walker): Initialize cache.  Adjust
        member type.
        (printf_strlen_execute): Improve debugging output.

gcc/c-family/ChangeLog:

        PR middle-end/93848
        * c.opt (-Wpointer-compare): Move option to gcc/common.opt.

gcc/c/ChangeLog:

        PR middle-end/93848
        * c-typeck.c (parser_build_binary_op): Adjust option name.

gcc/cp/ChangeLog:

        PR middle-end/93848
        * typeck.c (cp_build_binary_op): Adjust option name.

gcc/testsuite/ChangeLog:

        PR middle-end/93848
        * c-c++-common/Warray-bounds-3.c: Adjust text of expected warnings.
        * g++.dg/warn/Warray-bounds.C: Same.
        * g++.dg/warn/delete-array-1.C: Same.
        * gcc.dg/Walloc-size-larger-than-18.c: Convert pointers to integers
        to avoid expected warnings.
        * gcc.dg/Warray-bounds-29.c: Same.  Remove xfails.
        * gcc.dg/Warray-bounds-31.c: Same.  Add exoecred warnings.
        * gcc.dg/Warray-bounds-32.c: Add an expected warning.
        * gcc.dg/Warray-bounds-41.c: Remove xfails.
        * gcc.dg/Warray-bounds-46.c: Simplify test for warning.
        * gcc.dg/Warray-bounds-58.c: Add expected warnings.
        * gcc.dg/Warray-bounds-66.c: Adjust text of expected messages.
        * gcc.dg/Warray-bounds.c: Same.
        * gcc.dg/Wreturn-local-addr-8.c: Suppress -Wpointer-compare=2.
        * gcc.dg/Wstringop-overflow-23.c: Add expected warnings.
        * gcc.dg/attr-alloc_size-3.c: Convert pointers to integers to avoid
        expected warnings.
        * gcc.dg/attr-alloc_size-4.c: Same.
        * gcc.dg/attr-alloc_size-5.c: Same.
        * gcc.dg/pr83044.c: Add expected warning.
        * gcc.dg/strlenopt-57.c: Expect either kind of warning.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Suppress (otherwise
        expected) -Warray-bounds.
        * g++.dg/warn/Warray-bounds-14.C: New test.
        * g++.dg/warn/Wmismatched-new-delete-2.C: New test.
        * gcc.dg/Warray-bounds-69.c: New test.
        * gcc.dg/Warray-bounds-71.c: New test.
        * gcc.dg/Warray-bounds-72.c: New test.
        * gcc.dg/Warray-bounds-73.c: New test.
        * gcc.dg/Wpointer-compare-2.c: New test.
        * gcc.dg/Wpointer-compare-3.c: New test.
        * gcc.dg/Wpointer-compare-4.c: New test.
        * gcc.dg/Wpointer-compare.c: New test.
So the worry naturally is that we're late in the game.  But I think we
can reasonably go forward after a degree of testing.

I think you've got non-unique testnames in this patch.  Let's take one
example from Wpointer-compare-4.c:



+  // { dg-warning "subtracting pointers to distinct objects" "" { xfail *-*-* 
} .-1 }
+  // { dg-message "operand 1 is a null pointer constant" "" { xfail *-*-* } 
.-2 }

Remember, the testname is composed from the filename, the testname in
the dg-whatever (empty strings in both cases above" and the line number
(.-1 and .-2 above both ultimately refer to the same line).  The string
you're searching for does not factor into the testname.

It's possible this specific instance may be OK because of the dg-warning
vs dg-message difference.  But there's others that both use dg-message.

What I would recommend is reviewing the testsuite changes and looking at
each ".-" instance to ensure they're unique.

I didn't know you wanted the strings to be unique between different
directives too.  I'll update the tests, or more likely, try to find
a way to change the harness to ensure these things are unique.


@@ -224,6 +225,21 @@ access_ref::access_ref (tree bound /* = NULL_TREE */,
      }
  }
+/* Return the PARM_DECL that REF refers to or null if it doesn't. */
+
+tree
+access_ref::parm () const
+{
+  if (!ref || TREE_CODE (ref) != SSA_NAME)
+    return NULL;
+
+  gimple *def_stmt = SSA_NAME_DEF_STMT (ref);
+  if (!gimple_nop_p (def_stmt))
+    return NULL;
+
+  return SSA_NAME_VAR (ref);
+}
So I don't see anything in the code that ensures that REF is a
PARM_DECL.   Are these access_ref objects only created when we walk over
the function's PARM_DECLs?    I'm guessing that's the case and that I
just missed it.

Yes, I believe so.


Also note that convention is supposed to tag class members with a m_
prefix.  I don't think we're strict about that, but it would certainly
have helped me looking for how these things get set up.  Just searching
for "ref" generates too many unrelated hits.   Please try to use that
convention in the future.

@@ -320,6 +349,11 @@ access_ref::get_ref (vec<access_ref> *all_refs,
        if (phi_known_size && phi_arg_ref.sizrng[0] < minsize)
        minsize = phi_arg_ref.sizrng[0];
+ /* Disregard null pointers in PHIs with two or more arguments.
+        TODO: Handle this better!  */
+      if (nullp)
+       continue;
So a PHI with a single NULL argument is degenerate and should be
propagated away.  If one is still showing up, then I think returning the
equivalent of a "don't know" for all the analysis in here is probably
reasonable.

But what should the semantics be when there are multiple PHI arguments,
one or more of which are NULL?  Does it even make sense to do array
bounds checking in that case?  Presumably we're doing bounds checking
because either we're computing an address and trying to decide if its a
valid address or we're doing an actual memory reference and a NULL
pointer should be faulting in that case.

Compute_objsize conservatively punts on nulls to avoid warning
on things like memset(0, 0, n) in dead code.  That's overly
conservative for PHIs with more than one argument, so
PHI(null, ptr) is treated the same as compute_objsize(ptr).
If ptr is null it's a Don't Know, otherwise it's whatever ptr
points to.


And just so I understand.  It looks like if a PHI has multiple arguments
that use the argument with the most space remaining for analysis.  Given
two args with the same size, we take the larger one as a tie breaker.
That means we can miss warnings because we test the index against the
argument with the most remaining size or largest size, right?  I'm not
saying that's the wrong thing to do, I just want to make sure i
understand the mode of operation here.  In fact, it's probably the right
thing to do in terms of minimizing false positives.

Most of the PHI work is already on trunk.  You approved a separate
patch with it weeks ago.  My subsequent testing exposed some problems
with it that I fixed before committing, so there's probably some
overlap with this patch.

I mentioned on one of our calls I'd like to handle these cases better
by introducing some sort of predicate analysis.  Your suggestion was
to look at or reuse what the uninit pass does, so that's what I hope
to do for GCC 12.  Not just for this but for other warnings as well
(including the new -Wmismatched-dealloc).


On the subject of caching.  I had a conversation with Andrew on this
last week.  There's a reasonable chance he's going to be generalizing
Ranger's cache during gcc-12 and that we'll be able to use it as a
fairly generic caching mechanism.  So I wouldn't spend a ton of time on
adjusting/improving this implementation.   I'm assuming that your cache
is context insensitive, ie the values you're storing into it are global
values and not specific to a path or dominated subtree, right?

I don't plan to enhance the cache.  It's just a couple of vecs
with offsets and object sizes for pointer SSA_NAMEs.  If Andrew
expects to have something general-purpose that can be extended
to track all the data I keep track of I'll be happy to switch
to it.

@@ -5402,24 +5582,33 @@ compute_objsize_r (tree ptr, access_ref *pref, 
pointer_query *qry,
        return true;
      }
- if (code == VIEW_CONVERT_EXPR)
+  if (code == ASSERT_EXPR || code == VIEW_CONVERT_EXPR)
Are you really seeing ASSERT_EXPRs?  Those should only exist during VRP
and are generally to be avoided.  I'm not objecting to this, but mostly
want to make you aware that ASSERT_EXPRs are on their way out and if
you're relying heavily on them that's probably a good sign that we're
going to want to revisit this in a Ranger world.

This code is called from within VRP so yes, ASSERT_EXPRs are there.


Nit:
s/argumnent/argument/

For your builtins.c changes.

s/curret/current/
s/expressopns/expressions/

For your gimple-array-bounds.cc changes.



I'm going to ask that you not commit right now.  I'm going to allocate a
large fleet of testing machines (probably a few dozen) and get a
baseline test done as quickly as possible, then a second run with this
patch in its current state to get a sense of the fallout.  Most of the
fallout from the prereqs has already been addressed as well as other
bugs that were causing false positives -- so I'm hoping this patch
actually doesn't have a lot of fallout at this point.

You have been sending me reports from your testing with this patch
since October :)  You may recall it caused a couple of ICEs and
a false positive that I have fixes for in my tree.  The last time
we discussed it (privately) I said I'd wait to post my updates
until I heard more from you.  Let me do that now.  (Because of
the intervening changes I'm pretty sure the patch you have won't
apply anymore.)

Martin

Reply via email to