On 08/19/2016 05:35 AM, Aldy Hernandez wrote:
On 08/04/2016 12:37 PM, Jeff Law wrote:
On 07/27/2016 03:01 AM, Aldy Hernandez wrote:
Just in case this got lost in the noise, since I know there was a lot of
back and forth between Martin Sebor and me.

This is the last iteration.

Tested on x86-64 Linux.

OK for trunk?



gcc/

    * Makefile.in (OBJS): Add gimple-ssa-warn-walloca.o.
    * passes.def: Add two instances of pass_walloca.
    * tree-pass.h (make_pass_walloca): New.
    * gimple-ssa-warn-walloca.c: New file.
    * doc/invoke.texi: Document -Walloca, -Walloca-larger-than=, and
    -Wvla-larger-than= options.

gcc/c-family/

    * c.opt (Walloca): New.
    (Walloca-larger-than=): New.
    (Wvla-larger-than=): New.
As someone already noted, it's gimple-ssa-warn-alloca, not
gimple-ssa-warn-walloca for the ChangeLog entry.

Fixed.


On the nittish side, you're mixing C and C++ comment styles.  Choosing
one and sticking with it seems better :-)

Fixed.  Settled on C++ comments, except for the copyright headers and the
testcases.




+@item -Walloca
+@opindex Wno-alloca
+@opindex Walloca
+This option warns on all uses of @code{alloca} in the source.
+
+@item -Walloca-larger-than=@var{n}
+This option warns on calls to @code{alloca} that are not bounded by a
+controlling predicate limiting its size to @var{n} bytes, or calls to
+@code{alloca} where the bound is unknown.
So for each of these little examples, I'd stuff the code into a trivial
function definition and make "n" a parameter.  That way it's obvious the
value of "n" comes from a context where we don't initially know its
range, but we may be able to narrow the range due to statements in the
function.

Done.
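
For example, something along these lines (a hypothetical sketch of such
an example, not the literal invoke.texi text):

  #include <stddef.h>

  extern void use (void *);

  void
  f (size_t n)
  {
    if (n <= 500)
      // The controlling predicate bounds the size to at most 500
      // bytes, so -Walloca-larger-than=500 does not warn here.
      use (__builtin_alloca (n));
  }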


+
+class pass_walloca : public gimple_opt_pass
+{
+public:
+  pass_walloca (gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_walloca, ctxt), first_time_p (false)
+  {}
+  opt_pass *clone () { return new pass_walloca (m_ctxt); }
+  void set_pass_param (unsigned int n, bool param)
+    {
+      gcc_assert (n == 0);
+      first_time_p = param;
+    }
ISTM that you're using "first_time_p" here, but in passes.def you refer
to this parameter as "strict_mode_p" in comments.

ie:

+      NEXT_PASS (pass_walloca, /*strict_mode_p=*/false);

I'd just drop the /*strict_mode_p*/ comment in both places it appears in
your patch's change to passes.def.  I think we've generally frowned on
those embedded comments, even though some have snuck in.

I've seen a lot of embedded comments throughout GCC, especially in
optional type arguments.  ISTM it makes things clearer for these
parameters.  But hey, I don't care that much.  Fixed.


+
+// We have a few heuristics up our sleeve to determine if a call to
+// alloca() is within bounds.  Try them out and return the type of
+// alloca call this is based on its argument.
+//
+// Given a known argument (ARG) to alloca() and an EDGE (E)
+// calculating said argument, verify that the last statement in the BB
+// in E->SRC is a gate comparing ARG to an acceptable bound for
+// alloca().  See examples below.
+//
+// MAX_SIZE is WARN_ALLOCA= adjusted for VLAs.  It is the maximum size
+// in bytes we allow for arg.
+//
+// If the alloca bound is determined to be too large, ASSUMED_LIMIT is
+// set to the bound used to determine this.  ASSUMED_LIMIT is only set
+// for ALLOCA_BOUND_MAYBE_LARGE and ALLOCA_BOUND_DEFINITELY_LARGE.
+//
+// Returns the alloca type.
+
+static enum alloca_type
+alloca_call_type_by_arg (tree arg, edge e, unsigned max_size,
+             wide_int *assumed_limit)
So I wonder if you ought to have a structure here for the return value
which contains the alloca type and assumed limit.  I know in the past we
avoided aggregate returns, but these days that doesn't seem necessary.
Seems cleaner than having a return value and output parameters.

Done, C++ style with a simple constructor :).
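
Something along these lines, for the record (a sketch; the exact names
in the patch may differ, and wide_int / enum alloca_type are the
existing types from the file):

  // Bundle the classification together with the bound (only
  // meaningful for the *_LARGE classifications) so callers get one
  // return value instead of an output parameter.
  struct alloca_type_and_limit
  {
    enum alloca_type type;
    wide_int limit;

    alloca_type_and_limit (enum alloca_type type, wide_int limit)
      : type (type), limit (limit) { }
    alloca_type_and_limit (enum alloca_type type) : type (type) { }
  };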


+{
+  // All the tests below depend on the jump being on the TRUE path.
+  if (!(e->flags & EDGE_TRUE_VALUE))
+    return ALLOCA_UNBOUNDED;
Seems like a fairly arbitrary and undesirable limitation.  Couldn't the
developer just have easily written

if (arg > N)
    x = malloc (...)
else
    x = alloca (...)

It also seems like you'd want to handle the set of LT/LE/GT/GE rather
than just LE.  Or is it the case that we always canonicalize LT into LE
by adjusting the constant (I vaguely remember running into that in RTL,
so it's entirely possible, and there'd likely be a canonicalization of
GT/GE as well)?

Most of it gets canonicalized, but your testcase is definitely possible,
so I fixed this.
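
In test-case form, the shape that now gets handled is roughly this (a
hypothetical sketch along the lines of your example, not one of the new
tests):

  #include <stddef.h>
  #include <stdlib.h>

  extern void use (void *);

  void
  g (size_t n)
  {
    void *p;
    if (n > 500)
      // Too big for the stack; fall back to the heap.
      p = malloc (n);
    else
      // The bound is established on the FALSE edge of the GT
      // comparison, but the call is still limited to 500 bytes.
      p = __builtin_alloca (n);
    use (p);
  }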


It also seems that once Andrew's infrastructure is in place this becomes
dead code as we can just ask for the range at a point in the program,
including for each incoming edge.  You might want a comment to that
effect.

Done.





+
+  /* Check for:
+     if (arg .cond. LIMIT) -or- if (LIMIT .cond. arg)
+       alloca(arg);
+
+     Where LIMIT has a bound of unknown range.  */
+  tree limit = NULL;
+  if (gimple_cond_lhs (last) == arg)
+    limit = gimple_cond_rhs (last);
+  else if (gimple_cond_rhs (last) == arg)
+    limit = gimple_cond_lhs (last);
+  if (limit && TREE_CODE (limit) == SSA_NAME)
+    {
+      wide_int min, max;
+      value_range_type range_type = get_range_info (limit, &min, &max);
+      if (range_type == VR_UNDEFINED || range_type == VR_VARYING)
+    return ALLOCA_BOUND_UNKNOWN;
+      // FIXME: We could try harder here and handle a possible range
+      // or anti-range.  Hopefully the upcoming changes to range info
+      // will give us finer grained info, and we can avoid somersaults
+      // here.
Ah, can't you set *assumed_limit here?  It's just a matter of walking
through the cases and making the most conservative assumption.  So
assume the condition is LT, both objects are unsigned types and LIMIT
has a range like [5..25].  Then the resulting *assumed_limit must be 24.

Well, it turns out that we don't ever hit the FIXME path.  We either
handle limits we know with the previous code block (which always gets
normalized to [arg .cond. LIMIT]), or we handle the unknown path with
this block, where the form is [LIMIT .cond. arg].

I've simplified the code a bit and updated the comments.  I also added
a gcc_unreachable() just in case we ever hit this path, though I've
verified, at least by building glibc and binutils, that we never do.
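
For reference, the kind of input that ends up classified as
ALLOCA_BOUND_UNKNOWN is roughly this (hypothetical sketch):

  #include <stddef.h>

  extern void use (void *);

  void
  h (size_t n, size_t limit)
  {
    // LIMIT has no usable range info, so we can tell the call is
    // guarded but not whether the guard is small enough.
    if (n < limit)
      use (__builtin_alloca (n));
  }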


ISTM it might also be worth checking VRP here -- I'd expect it to be
able to make this kind of determination.  That would be independent of
this work (in the sense that if VRP isn't creating ranges for this, it
should be fixed independently).


+    }
+
+  return ALLOCA_UNBOUNDED;
+}
+
+// Return TRUE if SSA's definition is a cast from a signed type.
+// If so, set *INVALID_CASTED_TYPE to the signed type.
+
+static bool
+cast_from_signed_p (tree ssa, tree *invalid_casted_type)
+{
+  gimple *def = SSA_NAME_DEF_STMT (ssa);
+  if (def
+      && !gimple_nop_p (def)
+      && gimple_assign_cast_p (def)
+      && !TYPE_UNSIGNED (TREE_TYPE (gimple_assign_rhs1 (def))))
+    {
+      *invalid_casted_type = TREE_TYPE (gimple_assign_rhs1 (def));
+      return true;
+    }
+  return false;
Note that we may have a cast from a signed type, but if the RHS of that
cast has a positive range, then the cast isn't going to cause the
wrap-around effect that is so problematical.  That might help cut down
on false positives.

Actually, when the cast is from a known positive range we don't get a
VR_ANTI_RANGE; we get a proper VR_RANGE.

I've cleaned this code up a bit and merged some common conditionals.  In
the process, taking a subset of your advice and cleaning up some things,
I've managed to handle two cases where I was previously XFAILing.
So... fewer false positives.  More coverage.  Woo hoo!
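
For context, the signed-to-unsigned situation this heuristic is chasing
looks roughly like this (a hypothetical test case, not from the
testsuite):

  extern void use (void *);

  void
  k (int n)
  {
    if (n < 100)
      // A negative N passes the guard, and the implicit conversion
      // to the unsigned argument type then wraps around to a huge
      // value, so the bound above offers no real protection.
      use (__builtin_alloca (n));
  }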


+}
+
+// Return TURE if X has a maximum range of MAX, basically covering the
+// entire domain, in which case it's no range at all.
s/TURE/TRUE/

Fixed.



+
+static bool
+is_max (tree x, wide_int max)
+{
+  return wi::max_value (TREE_TYPE (x)) == max;
+}
I'm a bit surprised we don't have this kind of utility function
elsewhere.   I wonder if it'd be better to conceptualize this as a range
query since it looks like you're always using get_range_info to get an
object's range, then comparing what that returns to the maximal value of
the object's type.  Maybe that's too much bikeshedding...

*shrug*.  If you feel strongly, I can look into this, but I'm inherently
lazy :).




+
+// Analyze the alloca call in STMT and return an `enum alloca_type'
+// explaining what type of alloca it is.  IS_VLA is set if the alloca
+// call is really a BUILT_IN_ALLOCA_WITH_ALIGN, signifying a VLA.
+//
+// If the alloca bound is determined to be too large, ASSUMED_LIMIT is
+// set to the bound used to determine this.  ASSUMED_LIMIT is only set
+// for ALLOCA_BOUND_MAYBE_LARGE and ALLOCA_BOUND_DEFINITELY_LARGE.
+//
+// If the alloca call may be too large because of a cast from a signed
+// type to an unsigned type, set *INVALID_CASTED_TYPE to the
+// problematic signed type.
+
+static enum alloca_type
+alloca_call_type (gimple *stmt, bool is_vla, wide_int *assumed_limit,
+          tree *invalid_casted_type)
Again, consider an aggregate return.

Done.


+{
+  gcc_assert (gimple_alloca_call_p (stmt));
+  tree len = gimple_call_arg (stmt, 0);
+  enum alloca_type w = ALLOCA_UNBOUNDED;
+  wide_int min, max;
+
+  gcc_assert (!is_vla || warn_vla_limit > 0);
+  gcc_assert (is_vla || warn_alloca_limit > 0);
+
+  // Adjust warn_alloca_max_size for VLAs, by taking the underlying
+  // type into account.
+  unsigned HOST_WIDE_INT max_size;
+  if (is_vla)
+    max_size = (unsigned HOST_WIDE_INT) warn_vla_limit;
+  else
+    max_size = (unsigned HOST_WIDE_INT) warn_alloca_limit;
+
+  // Check for the obviously bounded case.
+  if (TREE_CODE (len) == INTEGER_CST)
+    {
+      if (tree_to_uhwi (len) > max_size)
+    {
+      *assumed_limit = len;
+      return ALLOCA_BOUND_DEFINITELY_LARGE;
+    }
+      if (integer_zerop (len))
+    return ALLOCA_ARG_IS_ZERO;
+      w = ALLOCA_OK;
+    }
+  else if (TREE_CODE (len) != SSA_NAME)
+    return ALLOCA_UNBOUNDED;
Hmm, other than INTEGER_CST and SSA_NAME, are there any other nodes we
can get here?   Perhaps we get DECLs and such, particularly when not
optimizing?!?

Nope.  We don't even run without optimization (because we need VRP/range
info).  I added a gcc_unreachable() to make sure and added an
appropriate comment.  It doesn't get triggered on tests or
glibc/binutils builds.


+  // Check the range info if available.
+  else
+    {
+      if (value_range_type range_type = get_range_info (len, &min, &max))
+    {
+      if (range_type == VR_RANGE)
+        {
+          if (wi::leu_p (max, max_size))
+        w = ALLOCA_OK;
+          else if (is_max (len, max))
+        {
+          // A cast may have created a range we don't care
+          // about.  For instance, a cast from 16-bit to
+          // 32-bit creates a range of 0..65535, even if there
+          // is not really a determinable range in the
+          // underlying code.  In this case, look through the
+          // cast at the original argument, and fall through
+          // to look at other alternatives.
+          gimple *def = SSA_NAME_DEF_STMT (len);
+          if (gimple_assign_cast_p (def))
+            len = gimple_assign_rhs1 (def);
+        }
+          else
+        {
+          /* If `len' is merely a cast that is being
+             calculated right before the call to alloca, look
+             at the range for the original value.
Yea, this is similar to my comment earlier that the RHS of the cast may
have a known range (non-negative) that allows us to not worry about the
cast creating a huge integer.  I think you can add handling that case
here without too much trouble.  Though you might consider pulling all
the casting bits into a separate function.

See previous comments.  I've merged and cleaned this up.


+
+             This avoids the cast creating a range where the
+             original expression did not have a range:
+
+             # RANGE [0, 18446744073709551615] NONZERO 4294967295
+             _2 = (long unsigned int) n_7(D);
+             p_9 = __builtin_alloca (_2);
Note this example would make more sense if the type of n_7 was explicit.

Merged and removed.



+      else if (range_type == VR_ANTI_RANGE)
+        {
+          // There may be some wrapping around going on.  Catch it
+          // with this heuristic.  Hopefully, this VR_ANTI_RANGE
+          // nonsense will go away, and we won't have to catch the
+          // sign conversion problems with this crap.
+          if (cast_from_signed_p (len, invalid_casted_type))
+        return ALLOCA_CAST_FROM_SIGNED;
Another place where casting from an object with a non-negative range
ought to be filtered out as not problematical.

Same.



+
+  // If we couldn't find anything, try a few heuristics for things we
+  // can easily determine.  Check these misc cases but only accept
+  // them if all predecessors have a known bound.
+  basic_block bb = gimple_bb (stmt);
+  if (w == ALLOCA_UNBOUNDED)
+    {
+      w = ALLOCA_OK;
+      for (unsigned ix = 0; ix < EDGE_COUNT (bb->preds); ix++)
+    {
+      enum alloca_type w
+        = alloca_call_type_by_arg (len, EDGE_PRED (bb, ix), max_size,
+                       assumed_limit);
+      if (w != ALLOCA_OK)
+        return w;
+    }
So this is an interesting tidbit that answers my questions about how we
use alloca_call_type_by_arg.  Essentially this is meant to catch the
merge point for flow control and give a conservative warning.  That's
fine and good.  But ISTM it's really a bit of a hack.  What if we have
something like this:

   X   Y   Z
    \  |  /
      \|/
       A
      / \
     /   \
    B     C
         / \
        /   \
       D     E


Where the alloca call is in E and the incoming edges to A actually have
useful information about the argument to the alloca call.

ISTM you need to be doing something with the dominator tree here to find
the merge point(s)  where we might know something useful.  And it's this
kind of test that makes me wonder about re-purposing some of the path
analysis code from tree-ssa-uninit.c.  It may be the case that the path
Z->A->C->E is unfeasible, but left in the CFG because duplication to
expose the unfeasible path was unprofitable.  If it turns out that the
only argument that causes problems comes from the edge Z->A, then we
wouldn't want to warn in this case.

  I don't see Andrew's work necessarily being able to solve this problem.

In my limited testing I've seen that 95% of all cases (I'm pulling this
number out of thin air ;-)) are relatively simple.  Just looking at the
definition of the SSA name in the alloca() call, or at the immediate
predecessors, yields most of the information we need.  I haven't seen
many more complicated cases that actually carry a range.

So I am hesitant to complicate things for something that doesn't seem
likely to happen.  Perhaps as a follow-up, if it shows up in the wild?

However, if you feel strongly about this, I will finally get around to
reading tree-ssa-uninit.c and getting down to business :).
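
For the record, a contrived sketch of the shape you describe, where the
useful bounds sit on the edges into a merge point well above the call
rather than in its immediate predecessors (purely illustrative):

  #include <stddef.h>

  extern void use (void *);
  extern int sel (void);

  void
  m (size_t n)
  {
    size_t len;
    if (sel () == 1)                  // X
      len = 16;
    else if (sel () == 2)             // Y
      len = 32;
    else if (n <= 64)                 // Z
      len = n;
    else
      return;
    // "A": each edge into this merge point establishes a small bound
    // on LEN, but none of that is visible in the immediate
    // predecessors of the block containing the alloca call below.
    if (sel ())                       // B
      use (&len);
    else if (sel ())                  // C -> D
      use (&n);
    else                              // C -> E
      use (__builtin_alloca (len));
  }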






+    }
+
+  return w;
+}
+
+// Return TRUE if the alloca call in STMT is in a loop, otherwise
+// return FALSE. As an exception, ignore alloca calls for VLAs that
+// occur in a loop since those will be cleaned up when they go out of
+// scope.
+
+static bool
+in_loop_p (bool is_vla, gimple *stmt)
+{
+  basic_block bb = gimple_bb (stmt);
+  if (bb->loop_father
+      // ?? Functions with no loops get a loop_father?  I
+      // don't get it.  The following conditional seems to do
+      // the trick to exclude such nonsense.
+      && bb->loop_father->header != ENTRY_BLOCK_PTR_FOR_FN (cfun))
I believe there is a "loop" that encompasses the whole function.

Removed the clueless comment.

Phew.  That took longer than expected.

Regstrapped on x86-64 Linux and the resulting compiler was verified by
building glibc and binutils.

Aldy
