On 12/07/2017 03:23 PM, Jeff Law wrote:
On 11/29/2017 04:36 PM, Martin Sebor wrote:
I've finished reimplementing the patch as a standalone pass.
In the attached revision I also addressed your comments below
as well as Richard's to allowing the strlen optimizations even
for overlapping accesses.

While beefing up the tests I found a few minor issues that
I also fixed (false negatives).

The fallout wasn't quite as bad as I thought, mainly thanks
to the narrow API for the checker.

Syncing up with the latest trunk has led to some more changes
in tree-ssa-strlen.

I've retested the patch with GDB and Glibc with the same results
as before.

The patch seems sizable (over 3KLOC without tests) but it's worth
noting that most of the complexity is actually not in determining
whether or not an overlap exists (that's quite simple) but rather
in computing its offset and size to mention in the warnings and
making sure the information is meaningful to the user even when
ranges are involved.  All the subtly different forms of warnings
also contribute substantially to the overall size.

Martin
[ Huge snip. ]


gcc-78918.diff


PR tree-optimization/78918 - missing -Wrestrict on memcpy copying over self

gcc/c-family/ChangeLog:

        PR tree-optimization/78918
        * c-common.c (check_function_restrict): Avoid checking built-ins.
        * c.opt (-Wrestrict): Include in -Wall.

gcc/ChangeLog:

        PR tree-optimization/78918
        * Makefile.in (OBJS): Add gimple-ssa-warn-restrict.o.
        * builtins.c (check_sizes): Rename...
        (check_access): ...to this.  Rename function arguments for clarity.
        (check_memop_sizes): Adjust names.
        (expand_builtin_memchr, expand_builtin_memcpy): Same.
        (expand_builtin_memmove, expand_builtin_mempcpy): Same.
        (expand_builtin_strcat, expand_builtin_stpncpy): Same.
        (check_strncat_sizes, expand_builtin_strncat): Same.
        (expand_builtin_strncpy, expand_builtin_memset): Same.
        (expand_builtin_bzero, expand_builtin_memcmp): Same.
        (expand_builtin_memory_chk, maybe_emit_chk_warning): Same.
        (maybe_emit_sprintf_chk_warning): Same.
        (expand_builtin_strcpy): Adjust.
        (expand_builtin_stpcpy): Same.
        (expand_builtin_with_bounds): Detect out-of-bounds accesses
        in pointer-checking forms of memcpy, memmove, and mempcpy.
        (gcall_to_tree_minimal, max_object_size): Define new functions.
        * builtins.h (max_object_size): Declare.
        * calls.c (alloc_max_size): Call max_object_size instead of
        hardcoding ssizetype limit.
        (get_size_range): Handle new argument.
        * calls.h (get_size_range): Add a new argument.
        * cfgexpand.c (expand_call_stmt): Propagate no-warning bit.
        * doc/invoke.texi (-Wrestrict): Adjust, add example.
        * gimple-fold.c (gimple_fold_builtin_memory_op): Detect overlapping
        operations.
        (gimple_fold_builtin_memory_chk): Same.
        (gimple_fold_builtin_stxcpy_chk): New function.
        * gimple-ssa-warn-restrict.c: New source.
        * gimple-ssa-warn-restrict.h: New header.
        * gimple.c (gimple_build_call_from_tree): Propagate location.
        * passes.def (pass_warn_restrict): Add new pass.
        * tree-pass.h (make_pass_warn_restrict): Declare.
        * tree-ssa-strlen.c (handle_builtin_strcpy): Detect overlapping
        operations.
        (handle_builtin_strcat): Same.
        (strlen_optimize_stmt): Rename...
        (strlen_check_and_optimize_stmt): ...to this.  Handle strncat,
        stpncpy, strncpy, and their checking forms.

gcc/testsuite/ChangeLog:

        PR tree-optimization/78918
        * c-c++-common/Warray-bounds.c: New test.
        * c-c++-common/Warray-bounds-2.c: New test.
        * c-c++-common/Warray-bounds-3.c: New test.
        * c-c++-common/Wrestrict-2.c: New test.
        * c-c++-common/Wrestrict.c: New test.
        * c-c++-common/Wrestrict.s: New test.
        * c-c++-common/Wsizeof-pointer-memaccess1.c: Adjust
        * c-c++-common/Wsizeof-pointer-memaccess2.c: Same.
        * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Same.
        * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
        * gcc.dg/memcpy-6.c: New test.
        * gcc.dg/pr69172.c: Adjust.
        * gcc.dg/pr79223.c: Same.
        * gcc.dg/Wrestrict-2.c: New test.
        * gcc.dg/Wrestrict.c: New test.
        * gcc.dg/Wsizeof-pointer-memaccess1.c
        * gcc.target/i386/chkp-stropt-17.c: New test.
        * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Adjust.

@@ -3874,32 +3885,32 @@ check_strncat_sizes (tree exp, tree objsize)
                                size_one_node)
                 : NULL_TREE);

-  /* Strncat copies at most MAXLEN bytes and always appends the terminating
+  /* Strncat copies at most MAXREAD bytes and always appends the terminating
Nit.  Use "strncat" rather than "Strncat", even when starting a
sentence.  I saw this elsewhere.  You can fix these in a follow-up which
you can consider pre-approved.

So please correct me if I'm wrong.  WRT overall structure, you've got a
pass which walks the IL looking for suitable calls and potentially warns.

Additionally, that pass exports check_bounds_or_overlap which we use to
varying degress in the folder and strlen optimization pass.  It doesn't
really query any pass specific state, it's just an exported interface
into the meat of your optimization pass.

Right.  The pass has no state.  It had none when it was in
builtins.c and it still has none now that it's its own pass.
There may be an opportunity in the future to start tracking
things like VLA sizes which would mean adding state to it.


Presumably you have those exported interfaces so that you can diagnose
things prior to folding or other transformations that potentially
obfuscate the calls you want to diagnose?  Right?

Right.


The main pass (which runs after VRP) presumably runs to capture more
stuff that is exposed by the optimizers.  ie, the optimizers may
const/copy propagate things which are necessary to expose some of the
cases we want to capture.

Right.


Assuming I'm correct on those issues, this starts to look a lot like how
we structure early/late warnings.

--


I believe standard practice is to put the pass class into the anonymous
namespace -- pass_data_wstrict & pass_wrestrict as well as any methods
you define for them.  I'm sure there's a good reason why we do that,
though I find it annoying given GDB's poor support for anonymous
namespaces.  BUt please follow convention here.

Sure.  IME, the anonymous namespaces makes setting breakpoints
difficult (e.g., in the tree-ssa-sprintf.c pass).  I can't just
use the break command followed by the name of a function defined
in such a namespace.  But it's not a big deal and I don't mind
wrapping things in an anonymous namespace here.


Consider walking the blocks in dominator order.  If the pass can benefit
from range data it'll make adding the evrp analyzer easier.  As a
reference see how I changed the sprintf warnings bits to use a dominator
walk order.

I remember you did that for the sprintf pass but since you'd
mentioned some runtime overhead and there was no apparent benefit
in this case I used a plain gimple pass.  But I'm fine changing
it if you're not concerned about the overhead and think it might
be helpful in the near future.


On pass_wrestrict::check_call, put its return value type onto a separate
line.


Ack.

It looks like you have to create an expression for each and every
builtin call statement you check.  Is there any way to defer creating
that CALL_EXPR until you know you need it in a diagnostic?  Can the uses
prior to the actual diagnostic get their data from elsewhere?

In builtins.c the CALL_EXPR was only created for callers outside
builtins.c (i.e., gimple-fold.c and tree-ssa-strlen.c).  Otherwise
the machinery worked with CALL_EXPR which is what its callers
within the file already had.

With the move to a gimple pass this has changed.  There are no
callers that work with CALL_EXPR anymore, they all work with
gimple statements.  So the CALL_EXPR interface is unnecessary
and can and probably should be replaced with a gimple*.  It
didn't occur to me to do it when I was scrambling to move
the code and get the prototype working, and then to get it
finished.  Let me look into making the change now.


It looks like you're doing a lot more than just restrict checking in the
new pass.  I see array bounds checking and what appears to be general
pointer-out-of-bounds checking in the new pass. Can you comment on that?

Detecting overlap and out-of-bounds accesses are closely related.
For example:

  void f (void)
  {
    char a[9] = "01234";

    if (i < 3 || i > 7)
      i = 3;

    char *d = a + i;   // i's range is [3, 7]

    const char *s = a + 4;

    strncpy (d, s, 4);
  }

If i > 5 the copy would overflow (form an out-of-bounds pointer)
and so -Wrestrict assumes i <= 5 holds, which means the copy
overlaps because the source points to the trailing "4\0" and
the destination to the NUL at a[5].  I.e., -Wrestrict can't
just consider overlap, it must also consider object bounds.

Other more obscure cases involve arrays of unknown bounds (whose
lower bound must be zero and upper bound PTRDIFF_MAX) and pointers
to unknown objects where (whose bounds are [PTRDIFF_MIN,
PTRDIFF_MAX]).  -Wrestrict has to check the bounds anyway so if
the size of a copy into the same object is large enough that it
would overflow its bounds, it makes sense to diagnose it using
-Warray-bounds.  If it doesn't overflow the bounds but does
overlap then the checker issues -Wrestrict.

Overall it looks pretty good.  I want to make sure I understand the
overall structure and reasoning behind that structure.

Sure.  The structure of the pass is simple.  It iterates over
calls to built-ins, checking each pointer argument to make sure
it's valid both before and after the access, and finally checking
the access for overlaps.

The complexity (in my mind) is in computing the offsets and sizes
of the overlap when it happens.

Martin

Reply via email to