On 6/3/19 5:24 PM, Martin Sebor wrote:
On 5/31/19 2:46 PM, Jeff Law wrote:
On 5/22/19 3:34 PM, Martin Sebor wrote:
-Wreturn-local-addr detects a subset of instances of returning
the address of a local object from a function but the warning
doesn't try to handle alloca or VLAs, or some non-trivial cases
of ordinary automatic variables[1].

The attached patch extends the implementation of the warning to
detect those.  It still doesn't detect instances where the address
is the result of a built-in such strcpy[2].

Tested on x86_64-linux.

Martin

[1] For example, this is only diagnosed with the patch:

   void* f (int i)
   {
     struct S { int a[2]; } s[2];
     return &s->a[i];
   }

[2] The following is not diagnosed even with the patch:

   void sink (void*);

   void* f (int i)
   {
     char a[6];
     char *p = __builtin_strcpy (a, "123");
     sink (p);
     return p;
   }

I would expect detecting to be possible and useful.  Maybe as
a follow-up.

gcc-71924.diff

PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset

gcc/ChangeLog:

    PR c/71924
    * gimple-ssa-isolate-paths.c (is_addr_local): New function.
    (warn_return_addr_local_phi_arg, warn_return_addr_local): Same.
    (find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg.
    (find_explicit_erroneous_behavior): Call warn_return_addr_local.

gcc/testsuite/ChangeLog:

    PR c/71924
    * gcc.dg/Wreturn-local-addr-2.c: New test.
    * gcc.dg/Walloca-4.c: Prune expected warnings.
    * gcc.dg/pr41551.c: Same.
    * gcc.dg/pr59523.c: Same.
    * gcc.dg/tree-ssa/pr88775-2.c: Same.
    * gcc.dg/winline-7.c: Same.

diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index 33fe352bb23..2933ecf502e 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -341,6 +341,135 @@ stmt_uses_0_or_null_in_undefined_way (gimple *stmt)
    return false;
  }
+/* Return true if EXPR is a expression of pointer type that refers
+   to the address of a variable with automatic storage duration.
+   If so, set *PLOC to the location of the object or the call that
+   allocated it (for alloca and VLAs).  When PMAYBE is non-null,
+   also consider PHI statements and set *PMAYBE when some but not
+   all arguments of such statements refer to local variables, and
+   to clear it otherwise.  */
+
+static bool
+is_addr_local (tree exp, location_t *ploc, bool *pmaybe = NULL,
+           hash_set<gphi *> *visited = NULL)
+{
+  if (TREE_CODE (exp) == SSA_NAME)
+    {
+      gimple *def_stmt = SSA_NAME_DEF_STMT (exp);
+      enum gimple_code code = gimple_code (def_stmt);
+
+      if (is_gimple_assign (def_stmt))
+    {
+      tree type = TREE_TYPE (gimple_assign_lhs (def_stmt));
+      if (POINTER_TYPE_P (type))
+        {
+          tree ptr = gimple_assign_rhs1 (def_stmt);
+          return is_addr_local (ptr, ploc, pmaybe, visited);
+        }
+      return false;
+    }
So this is going to recurse on the rhs1 of something like
POINTER_PLUS_EXPR, that's a good thing :-)   But isn't it non-selective
about the codes where we recurse?

Consider

   ptr = (cond) ? res1 : res2

I think we'll end up recursing on the condition rather than looking at
res1 and res2.


I suspect there are a very limited number of expression codes that
appear on the RHS where we'd want to recurse on one or both operands.

POINTER_PLUS_EXPR, NOP_EXPR, maybe COND_EXPR (where you have to recurse
on both and logically and the result), BIT_AND (maybe we masked off some
bits in an address).  That's probably about it :-)

Are there any other codes you've seen or think would be useful in
practice to recurse through?  I'd rather list them explicitly rather
than just recurse down through every rhs1 we encounter.

I don't have a list of codes to test for.  I initially contemplated
enumerating them but in the end decided the pointer type check would
be sufficient.  I wouldn't expect a COND_EXPR here.  Don't they get
transformed into PHIs?  In all my tests they do and and running
the whole test suite with an assert that it doesn't come up doesn't
expose any either.  (I left the assert for COND_EXPR there.)  If
a COND_EXPR really can come up in a GIMPLE assignment here can you
please show me how so I can add a test for it?

I've added tests to exercise all C expressions that evaluate to
pointers.  I don't know of any others where what you bring up
should be a concern and I don't want to try to hardwire tests for
any that I can't to exercise in the testsuite or don't know how.
If you know of some I'm happy to add them and adjust the code.

+
+      if (code == GIMPLE_PHI && pmaybe)
+    {
+      unsigned count = 0;
+      gphi *phi_stmt = as_a <gphi *> (def_stmt);
+
+      unsigned nargs = gimple_phi_num_args (phi_stmt);
+      for (unsigned i = 0; i < nargs; ++i)
+        {
+          if (!visited->add (phi_stmt))
+        {
+          tree arg = gimple_phi_arg_def (phi_stmt, i);
+          if (is_addr_local (arg, ploc, pmaybe, visited))
+            ++count;
+        }
+        }
So imagine

p = phi (x1, x1, x2)

Where both x1 and x2 would pass is_addr_local.  I think this code would
incorrectly set pmaybe.

I suppose it would but this happens readily with or without my
patch, for example here:

   int* f (int i)
   {
     int a[2], b[2];
     int *p = i ? a : b;
     return p;
   }

We end up with two instances of "function may return address
of local variable," one for each local, because
find_implicit_erroneous_behavior only issues the "may return"
kind of warning.

I don't particularly like this -- that's why I added PMAYBE to
the new code.  But to avoid mission creep I'd decided not draw
the line there and avoid trying to fix the problem completely.
(I've enhanced this in the attached update.)


We process the first instance of x1, find it is local and bump count.
When we encounter the second instance, it's in our map and we do
nothing.  THen we process x2 and bump count again.  So count would be 2.
  But count is going to be less than nargs so we'll set *pmaybe to true.

ISTM you'd have to record the result for each phi argument and query
that to determine if you should bump the counter if you've already
visited the argument.

I suspect that no solution will be perfect, at a minimum because
multiple return statements sometimes get merged into one, so what
in the source code is a single return that definitely returns
the address a local may end up merged with one that returns
a null (and triggering a maybe kind of warning).  I have also
seen warnings in some non-trivial test cases that suggest that
some return statements get split up into two where a "may return"
could turn into a "definitely does return."

It also seems to me that you can break the loop as soon as you've got a
nonzero count and get a false return from is_addr_local.   Not sure if
that'll matter in practice or not.

This is only possible if we're willing to lose some detail (i.e.,
if we are willing to only point to one variable when returning
the address of two or more).  I don't find that very user-friendly.

To wrap up, while I would have preferred taking a simpler approach
at first and making bigger design changes later, I have redesigned
the warning for better accuracy as you suggested above.

The attached revised patch first collects the return statements
in a hash table along with the locations of the local variables
whose addresses each statement returns, and then issues just one
warning for each statement, listing all the locals it refers to
in subsequent notes.

In addition, since it was nearly trivial, I also added checking
for returning addresses of locals via calls to built-ins like
memcpy.

I didn't fully retest the patch after a minor tweak and sure enough,
the tweak was wrong.  Please apply the following on top of the patch.
(We only want to consider the argument of the call.  The LHS is
the call itself.)

diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index eb8e754870f..bb9616f26ec 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -475,14 +475,9 @@ is_addr_local (gimple *return_stmt, tree exp, location_t *ploc,
            case BUILT_IN_STRNCAT_CHK:
            case BUILT_IN_STRNCPY:
            case BUILT_IN_STRNCPY_CHK:
-             /* Check both the argument and the return value.  */
-             return (is_addr_local (return_stmt,
-                                    gimple_call_arg (def_stmt, 0),
-                                    ploc, plocmap, visited)
-                     || is_addr_local (return_stmt,
-                                       gimple_call_lhs (def_stmt),
-                                       ploc, plocmap, visited));
-
+             return is_addr_local (return_stmt,
+                                   gimple_call_arg (def_stmt, 0),
+                                   ploc, plocmap, visited);
            default:
              return false;
            }

I have beefed up the test suite to exercise non-trivial functions
involving different kinds of expressions and statements, including
loops.  Except for one xfail due to missing location information
(bug 90735 - missing location in -Wreturn-local-addr on a function
with two return statements​) I haven't found any issues.
The improved warning does find many more problems than the current
solution.

The other approach here (and I'm not suggesting you implement it) would
be to use the propagation engine.  That's probably overkill here since
we'd probably end up computing a whole lot of things we don't need.  My
sense is an on-demand approach like you've done is probably less
computationally expensive.

I'm certainly not opposed to making further improvements (as I
mentioned, I'd like to add checking for returning freed pointers
and freeing locals as you suggested).  But I would prefer to make
these in incremental steps rather than growing what was at first
meant to be just small bug fix/enhancement for alloca and VLAs.

The attached update has been tested on x86_64-linux.

Martin

Reply via email to