On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
This patch adds a new __attribute__ ((tainted)) to the C/C++
frontends.

Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of this
patch (attribute registration, documentation, the name of the
attribute, etc).

(I believe it's independent of the rest of the patch kit, in that it
could go into trunk without needing the prior patches)

Thanks
Dave

Getting close to end of stage 3 for GCC 12, so pinging this patch
again...

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html

The c-family change is OK.

Thanks
Dave




It can be used on function decls: the analyzer will treat as tainted
all parameters to the function and all buffers pointed to by
parameters
to the function.  Adding this in one place to the Linux kernel's
__SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls as
having tainted inputs.  This gives additional testing beyond e.g.
__user
pointers added by earlier patches - an example of the use of this can
be
seen in CVE-2011-2210, where given:

  SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
buffer,
                  unsigned long, nbytes, int __user *, start, void
__user *, arg)

the analyzer will treat the nbytes param as under attacker control,
and
can complain accordingly:

taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-controlled
value
   ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
Wanalyzer-tainted-size]
    69 |                 if (copy_to_user(buffer, hwrpb, nbytes) != 0)
       |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Additionally, the patch allows the attribute to be used on field
decls:
specifically function pointers.  Any function used as an initializer
for such a field gets treated as tainted.  An example can be seen in
CVE-2020-13143, where adding __attribute__((tainted)) to the "store"
callback of configfs_attribute:

   struct configfs_attribute {
      /* [...snip...] */
      ssize_t (*store)(struct config_item *, const char *, size_t)
        __attribute__((tainted));
      /* [...snip...] */
   };

allows the analyzer to see:

  CONFIGFS_ATTR(gadget_dev_desc_, UDC);

and treat gadget_dev_desc_UDC_store as tainted, so that it complains:

taint-CVE-2020-13143-1.c: In function ‘gadget_dev_desc_UDC_store’:
taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-controlled
value
   ‘len + 18446744073709551615’ as offset without upper-bounds
checking [CWE-823] [-Wanalyzer-tainted-offset]
    33 |         if (name[len - 1] == '\n')
       |             ~~~~^~~~~~~~~

Similarly, the attribute could be used on the ioctl callback field,
USB device callbacks, network-handling callbacks etc.  This
potentially
gives a lot of test coverage with relatively little code annotation,
and
without necessarily needing link-time analysis (which -fanalyzer can
only do at present on trivial examples).

I believe this is the first time we've had an attribute on a field.
If that's an issue, I could prepare a version of the patch that
merely allowed it on functions themselves.

As before this currently still needs -fanalyzer-checker=taint (in
addition to -fanalyzer).

gcc/analyzer/ChangeLog:
         * engine.cc: Include "stringpool.h", "attribs.h", and
         "tree-dfa.h".
         (mark_params_as_tainted): New.
         (class tainted_function_custom_event): New.
         (class tainted_function_info): New.
         (exploded_graph::add_function_entry): Handle functions with
         "tainted" attribute.
         (class tainted_field_custom_event): New.
         (class tainted_callback_custom_event): New.
         (class tainted_call_info): New.
         (add_tainted_callback): New.
         (add_any_callbacks): New.
         (exploded_graph::build_initial_worklist): Find callbacks that
are
         reachable from global initializers, calling add_any_callbacks
on
         them.

gcc/c-family/ChangeLog:
         * c-attribs.c (c_common_attribute_table): Add "tainted".
         (handle_tainted_attribute): New.

gcc/ChangeLog:
         * doc/extend.texi (Function Attributes): Note that "tainted"
can
         be used on field decls.
         (Common Function Attributes): Add entry on "tainted"
attribute.

gcc/testsuite/ChangeLog:
         * gcc.dg/analyzer/attr-tainted-1.c: New test.
         * gcc.dg/analyzer/attr-tainted-misuses.c: New test.
         * gcc.dg/analyzer/taint-CVE-2011-2210-1.c: New test.
         * gcc.dg/analyzer/taint-CVE-2020-13143-1.c: New test.
         * gcc.dg/analyzer/taint-CVE-2020-13143-2.c: New test.
         * gcc.dg/analyzer/taint-CVE-2020-13143.h: New test.
         * gcc.dg/analyzer/taint-alloc-3.c: New test.
         * gcc.dg/analyzer/taint-alloc-4.c: New test.

Signed-off-by: David Malcolm <dmalc...@redhat.com>
---
  gcc/analyzer/engine.cc                        | 317
+++++++++++++++++-
  gcc/c-family/c-attribs.c                      |  36 ++
  gcc/doc/extend.texi                           |  22 +-
  .../gcc.dg/analyzer/attr-tainted-1.c          |  88 +++++
  .../gcc.dg/analyzer/attr-tainted-misuses.c    |   6 +
  .../gcc.dg/analyzer/taint-CVE-2011-2210-1.c   |  93 +++++
  .../gcc.dg/analyzer/taint-CVE-2020-13143-1.c  |  38 +++
  .../gcc.dg/analyzer/taint-CVE-2020-13143-2.c  |  32 ++
  .../gcc.dg/analyzer/taint-CVE-2020-13143.h    |  91 +++++
  gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c |  21 ++
  gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  31 ++
  11 files changed, 772 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-tainted-
misuses.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-
2210-1.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
13143-1.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
13143-2.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-
13143.h
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 096e219392d..5fab41daf93 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -68,6 +68,9 @@ along with GCC; see the file COPYING3.  If not see
  #include "plugin.h"
  #include "target.h"
  #include <memory>
+#include "stringpool.h"
+#include "attribs.h"
+#include "tree-dfa.h"
 /* For an overview, see gcc/doc/analyzer.texi.  */ @@ -2276,6 +2279,116 @@ exploded_graph::~exploded_graph ()
      delete (*iter).second;
  }
+/* Subroutine for use when implementing __attribute__((tainted))
+   on functions and on function pointer fields in structs.
+
+   Called on STATE representing a call to FNDECL.
+   Mark all params of FNDECL in STATE as "tainted".  Mark the value
of all
+   regions pointed to by params of FNDECL as "tainted".
+
+   Return true if successful; return false if the "taint" state
machine
+   was not found.  */
+
+static bool
+mark_params_as_tainted (program_state *state, tree fndecl,
+                       const extrinsic_state &ext_state)
+{
+  unsigned taint_sm_idx;
+  if (!ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx))
+    return false;
+  sm_state_map *smap = state->m_checker_states[taint_sm_idx];
+
+  const state_machine &sm = ext_state.get_sm (taint_sm_idx);
+  state_machine::state_t tainted = sm.get_state_by_name ("tainted");
+
+  region_model_manager *mgr = ext_state.get_model_manager ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
+       iter_parm = DECL_CHAIN (iter_parm))
+    {
+      tree param = iter_parm;
+      if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
+       param = parm_default_ssa;
+      const region *param_reg = state->m_region_model->get_lvalue
(param, NULL);
+      const svalue *init_sval = mgr->get_or_create_initial_value
(param_reg);
+      smap->set_state (state->m_region_model, init_sval,
+                      tainted, NULL /*origin_new_sval*/, ext_state);
+      if (POINTER_TYPE_P (TREE_TYPE (param)))
+       {
+         const region *pointee_reg = mgr->get_symbolic_region
(init_sval);
+         /* Mark "*param" as tainted.  */
+         const svalue *init_pointee_sval
+           = mgr->get_or_create_initial_value (pointee_reg);
+         smap->set_state (state->m_region_model, init_pointee_sval,
+                          tainted, NULL /*origin_new_sval*/,
ext_state);
+       }
+    }
+
+  return true;
+}
+
+/* Custom event for use by tainted_function_info when a function
+   has been marked with __attribute__((tainted)).  */
+
+class tainted_function_custom_event : public custom_event
+{
+public:
+  tainted_function_custom_event (location_t loc, tree fndecl, int
depth)
+  : custom_event (loc, fndecl, depth),
+    m_fndecl (fndecl)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text
+      (can_colorize,
+       "function %qE marked with %<__attribute__((tainted))%>",
+       m_fndecl);
+  }
+
+private:
+  tree m_fndecl;
+};
+
+/* Custom exploded_edge info for top-level calls to a function
+   marked with __attribute__((tainted)).  */
+
+class tainted_function_info : public custom_edge_info
+{
+public:
+  tainted_function_info (tree fndecl)
+  : m_fndecl (fndecl)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted function");
+  };
+
+  bool update_model (region_model *,
+                    const exploded_edge *,
+                    region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+                          const exploded_edge &) const FINAL
OVERRIDE
+  {
+    emission_path->add_event
+      (new tainted_function_custom_event
+       (DECL_SOURCE_LOCATION (m_fndecl), m_fndecl, 0));
+  }
+
+private:
+  tree m_fndecl;
+};
+
  /* Ensure that there is an exploded_node representing an external
call to
     FUN, adding it to the worklist if creating it.
@@ -2302,14 +2415,25 @@ exploded_graph::add_function_entry (function
*fun)
    program_state state (m_ext_state);
    state.push_frame (m_ext_state, fun);
+  custom_edge_info *edge_info = NULL;
+
+  if (lookup_attribute ("tainted", DECL_ATTRIBUTES (fun->decl)))
+    {
+      if (mark_params_as_tainted (&state, fun->decl, m_ext_state))
+       edge_info = new tainted_function_info (fun->decl);
+    }
+
    if (!state.m_valid)
      return NULL;
   exploded_node *enode = get_or_create_node (point, state, NULL);
    if (!enode)
-    return NULL;
+    {
+      delete edge_info;
+      return NULL;
+    }
-  add_edge (m_origin, enode, NULL);
+  add_edge (m_origin, enode, NULL, edge_info);
   m_functions_with_enodes.add (fun); @@ -2623,6 +2747,184 @@ toplevel_function_p (function *fun, logger
*logger)
    return true;
  }
+/* Custom event for use by tainted_call_info when a callback field
has been
+   marked with __attribute__((tainted)), for labelling the field.
*/
+
+class tainted_field_custom_event : public custom_event
+{
+public:
+  tainted_field_custom_event (tree field)
+  : custom_event (DECL_SOURCE_LOCATION (field), NULL_TREE, 0),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+                           "field %qE of %qT"
+                           " is marked with
%<__attribute__((tainted))%>",
+                           m_field, DECL_CONTEXT (m_field));
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom event for use by tainted_call_info when a callback field
has been
+   marked with __attribute__((tainted)), for labelling the function
used
+   in that callback.  */
+
+class tainted_callback_custom_event : public custom_event
+{
+public:
+  tainted_callback_custom_event (location_t loc, tree fndecl, int
depth,
+                                tree field)
+  : custom_event (loc, fndecl, depth),
+    m_field (field)
+  {
+  }
+
+  label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+  {
+    return make_label_text (can_colorize,
+                           "function %qE used as initializer for
field %qE"
+                           " marked with
%<__attribute__((tainted))%>",
+                           m_fndecl, m_field);
+  }
+
+private:
+  tree m_field;
+};
+
+/* Custom edge info for use when adding a function used by a
callback field
+   marked with '__attribute__((tainted))'.   */
+
+class tainted_call_info : public custom_edge_info
+{
+public:
+  tainted_call_info (tree field, tree fndecl, location_t loc)
+  : m_field (field), m_fndecl (fndecl), m_loc (loc)
+  {}
+
+  void print (pretty_printer *pp) const FINAL OVERRIDE
+  {
+    pp_string (pp, "call to tainted field");
+  };
+
+  bool update_model (region_model *,
+                    const exploded_edge *,
+                    region_model_context *) const FINAL OVERRIDE
+  {
+    /* No-op.  */
+    return true;
+  }
+
+  void add_events_to_path (checker_path *emission_path,
+                          const exploded_edge &) const FINAL
OVERRIDE
+  {
+    /* Show the field in the struct declaration
+       e.g. "(1) field 'store' is marked with
'__attribute__((tainted))'"  */
+    emission_path->add_event
+      (new tainted_field_custom_event (m_field));
+
+    /* Show the callback in the initializer
+       e.g.
+       "(2) function 'gadget_dev_desc_UDC_store' used as initializer
+       for field 'store' marked with '__attribute__((tainted))'".
*/
+    emission_path->add_event
+      (new tainted_callback_custom_event (m_loc, m_fndecl, 0,
m_field));
+  }
+
+private:
+  tree m_field;
+  tree m_fndecl;
+  location_t m_loc;
+};
+
+/* Given an initializer at LOC for FIELD marked with
'__attribute__((tainted))'
+   initialized with FNDECL, add an entrypoint to FNDECL to EG (and
to its
+   worklist) where the params to FNDECL are marked as tainted.  */
+
+static void
+add_tainted_callback (exploded_graph *eg, tree field, tree fndecl,
+                     location_t loc)
+{
+  logger *logger = eg->get_logger ();
+
+  LOG_SCOPE (logger);
+
+  if (!gimple_has_body_p (fndecl))
+    return;
+
+  const extrinsic_state &ext_state = eg->get_ext_state ();
+
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+  gcc_assert (fun);
+
+  program_point point
+    = program_point::from_function_entry (eg->get_supergraph (),
fun);
+  program_state state (ext_state);
+  state.push_frame (ext_state, fun);
+
+  if (!mark_params_as_tainted (&state, fndecl, ext_state))
+    return;
+
+  if (!state.m_valid)
+    return;
+
+  exploded_node *enode = eg->get_or_create_node (point, state,
NULL);
+  if (logger)
+    {
+      if (enode)
+       logger->log ("created EN %i for tainted %qE entrypoint",
+                    enode->m_index, fndecl);
+      else
+       {
+         logger->log ("did not create enode for tainted %qE
entrypoint",
+                      fndecl);
+         return;
+       }
+    }
+
+  tainted_call_info *info = new tainted_call_info (field, fndecl,
loc);
+  eg->add_edge (eg->get_origin (), enode, NULL, info);
+}
+
+/* Callback for walk_tree for finding callbacks within initializers;
+   ensure that any callback initializer where the corresponding
field is
+   marked with '__attribute__((tainted))' is treated as an
entrypoint to the
+   analysis, special-casing that the inputs to the callback are
+   untrustworthy.  */
+
+static tree
+add_any_callbacks (tree *tp, int *, void *data)
+{
+  exploded_graph *eg = (exploded_graph *)data;
+  if (TREE_CODE (*tp) == CONSTRUCTOR)
+    {
+      /* Find fields with the "tainted" attribute.
+        walk_tree only walks the values, not the index values;
+        look at the index values.  */
+      unsigned HOST_WIDE_INT idx;
+      constructor_elt *ce;
+
+      for (idx = 0; vec_safe_iterate (CONSTRUCTOR_ELTS (*tp), idx,
&ce);
+          idx++)
+       if (ce->index && TREE_CODE (ce->index) == FIELD_DECL)
+         if (lookup_attribute ("tainted", DECL_ATTRIBUTES (ce-
index)))
+           {
+             tree value = ce->value;
+             if (TREE_CODE (value) == ADDR_EXPR
+                 && TREE_CODE (TREE_OPERAND (value, 0)) ==
FUNCTION_DECL)
+               add_tainted_callback (eg, ce->index, TREE_OPERAND
(value, 0),
+                                     EXPR_LOCATION (value));
+           }
+    }
+
+  return NULL_TREE;
+}
+
  /* Add initial nodes to EG, with entrypoints for externally-callable
     functions.  */
@@ -2648,6 +2950,17 @@ exploded_graph::build_initial_worklist ()
           logger->log ("did not create enode for %qE entrypoint",
fun->decl);
        }
    }
+
+  /* Find callbacks that are reachable from global initializers.  */
+  varpool_node *vpnode;
+  FOR_EACH_VARIABLE (vpnode)
+    {
+      tree decl = vpnode->decl;
+      tree init = DECL_INITIAL (decl);
+      if (!init)
+       continue;
+      walk_tree (&init, add_any_callbacks, this, NULL);
+    }
  }
 /* The main loop of the analysis.
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 9e03156de5e..835ba6e0e8c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -117,6 +117,7 @@ static tree
handle_no_profile_instrument_function_attribute (tree *, tree,
                                                              tree,
int, bool *);
  static tree handle_malloc_attribute (tree *, tree, tree, int, bool
*);
  static tree handle_dealloc_attribute (tree *, tree, tree, int, bool
*);
+static tree handle_tainted_attribute (tree *, tree, tree, int, bool
*);
  static tree handle_returns_twice_attribute (tree *, tree, tree, int,
bool *);
  static tree handle_no_limit_stack_attribute (tree *, tree, tree,
int,
                                              bool *);
@@ -569,6 +570,8 @@ const struct attribute_spec
c_common_attribute_table[] =
                               handle_objc_nullability_attribute, NULL
},
    { "*dealloc",                1, 2, true, false, false, false,
                               handle_dealloc_attribute, NULL },
+  { "tainted",               0, 0, true,  false, false, false,
+                             handle_tainted_attribute, NULL },
    { NULL,                     0, 0, false, false, false, false,
NULL, NULL }
  };
@@ -5857,6 +5860,39 @@ handle_objc_nullability_attribute (tree *node,
tree name, tree args,
    return NULL_TREE;
  }
+/* Handle a "tainted" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_tainted_attribute (tree *node, tree name, tree, int,
+                         bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL
+      && TREE_CODE (*node) != FIELD_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored; valid only "
+              "for functions and function pointer fields",
+              name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  if (TREE_CODE (*node) == FIELD_DECL
+      && !(TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE
+          && TREE_CODE (TREE_TYPE (TREE_TYPE (*node))) ==
FUNCTION_TYPE))
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored;"
+              " field must be a function pointer",
+              name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  *no_add_attrs = false; /* OK */
+
+  return NULL_TREE;
+}
+
  /* Attempt to partially validate a single attribute ATTR as if
     it were to be applied to an entity OPER.  */
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5a6ef464779..826bbd48e7e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2465,7 +2465,8 @@ variable declarations (@pxref{Variable
Attributes}),
  labels (@pxref{Label Attributes}),
  enumerators (@pxref{Enumerator Attributes}),
  statements (@pxref{Statement Attributes}),
-and types (@pxref{Type Attributes}).
+types (@pxref{Type Attributes}),
+and on field declarations (for @code{tainted}).
 There is some overlap between the purposes of attributes and pragmas
  (@pxref{Pragmas,,Pragmas Accepted by GCC}).  It has been
@@ -3977,6 +3978,25 @@ addition to creating a symbol version (as if
  @code{"@var{name2}@@@var{nodename}"} was used) the version will be
also used
  to resolve @var{name2} by the linker.
+@item tainted
+@cindex @code{tainted} function attribute
+The @code{tainted} attribute is used to specify that a function is
called
+in a way that requires sanitization of its arguments, such as a
system
+call in an operating system kernel.  Such a function can be
considered part
+of the ``attack surface'' of the program.  The attribute can be used
both
+on function declarations, and on field declarations containing
function
+pointers.  In the latter case, any function used as an initializer
of
+such a callback field will be treated as tainted.
+
+The analyzer will pay particular attention to such functions when
both
+@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are
supplied,
+potentially issuing warnings guarded by
+@option{-Wanalyzer-exposure-through-uninit-copy},
+@option{-Wanalyzer-tainted-allocation-size},
+@option{-Wanalyzer-tainted-array-index},
+@option{Wanalyzer-tainted-offset},
+and @option{Wanalyzer-tainted-size}.
+
  @item target_clones (@var{options})
  @cindex @code{target_clones} function attribute
  The @code{target_clones} attribute is used to specify that a
function
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
new file mode 100644
index 00000000000..cc4d5900372
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-1.c
@@ -0,0 +1,88 @@
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+
+struct arg_buf
+{
+  int i;
+  int j;
+};
+
+/* Example of marking a function as tainted.  */
+
+void __attribute__((tainted))
+test_1 (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", *q); /* { dg-warning "state:
'tainted'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
'tainted'" } */
+}
+
+/* Example of marking a callback field as tainted.  */
+
+struct s2
+{
+  void (*cb) (int, void *, char *)
+    __attribute__((tainted));
+};
+
+/* Function not marked as tainted.  */
+
+void
+test_2a (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the normal entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
'start'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
'start'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
'start'" } */
+
+  struct arg_buf *args = p;
+  __analyzer_dump_state ("taint", args->i); /* { dg-warning "state:
'start'" } */
+  __analyzer_dump_state ("taint", args->j); /* { dg-warning "state:
'start'" } */
+}
+
+/* Function referenced via t2b.cb, marked as "tainted".  */
+
+void
+test_2b (int i, void *p, char *q)
+{
+  /* There should be two enodes
+     for the direct call, and the "tainted" entry to the function.
*/
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 processed
enodes" } */
+}
+
+/* Callback used via t2c.cb, marked as "tainted".  */
+void
+__analyzer_test_2c (int i, void *p, char *q)
+{
+  /* There should be a single enode,
+     for the "tainted" entry to the function.  */
+  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed
enode" } */
+
+  __analyzer_dump_state ("taint", i); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", p); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", q); /* { dg-warning "state:
'tainted'" } */
+}
+
+struct s2 t2b =
+{
+  .cb = test_2b
+};
+
+struct s2 t2c =
+{
+  .cb = __analyzer_test_2c
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
new file mode 100644
index 00000000000..6f4cbc82efb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-tainted-misuses.c
@@ -0,0 +1,6 @@
+int not_a_fn __attribute__ ((tainted)); /* { dg-warning "'tainted'
attribute ignored; valid only for functions and function pointer
fields" } */
+
+struct s
+{
+  int f __attribute__ ((tainted)); /* { dg-warning "'tainted'
attribute ignored; field must be a function pointer" } */
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
new file mode 100644
index 00000000000..fe6c7ebbb1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2011-2210-1.c
@@ -0,0 +1,93 @@
+/* "The osf_getsysinfo function in arch/alpha/kernel/osf_sys.c in
the
+   Linux kernel before 2.6.39.4 on the Alpha platform does not
properly
+   restrict the data size for GSI_GET_HWRPB operations, which allows
+   local users to obtain sensitive information from kernel memory
via
+   a crafted call."
+
+   Fixed in 3d0475119d8722798db5e88f26493f6547a4bb5b on linux-
2.6.39.y
+   in linux-stable.  */
+
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include "test-uaccess.h"
+
+/* Adapted from include/linux/linkage.h.  */
+
+#define asmlinkage
+
+/* Adapted from include/linux/syscalls.h.  */
+
+#define __SC_DECL1(t1, a1)     t1 a1
+#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
+#define __SC_DECL3(t3, a3, ...) t3 a3, __SC_DECL2(__VA_ARGS__)
+#define __SC_DECL4(t4, a4, ...) t4 a4, __SC_DECL3(__VA_ARGS__)
+#define __SC_DECL5(t5, a5, ...) t5 a5, __SC_DECL4(__VA_ARGS__)
+#define __SC_DECL6(t6, a6, ...) t6 a6, __SC_DECL5(__VA_ARGS__)
+
+#define SYSCALL_DEFINEx(x, sname, ...)                         \
+       __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+
+#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
+#define __SYSCALL_DEFINEx(x, name,
...)                                        \
+       asmlinkage __attribute__((tainted)) \
+       long sys##name(__SC_DECL##x(__VA_ARGS__))
+
+#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name,
__VA_ARGS__)
+
+/* Adapted from arch/alpha/include/asm/hwrpb.h.  */
+
+struct hwrpb_struct {
+       unsigned long phys_addr;        /* check: physical address of
the hwrpb */
+       unsigned long id;               /* check: "HWRPB\0\0\0" */
+       unsigned long revision;
+       unsigned long size;             /* size of hwrpb */
+       /* [...snip...] */
+};
+
+extern struct hwrpb_struct *hwrpb;
+
+/* Adapted from arch/alpha/kernel/osf_sys.c.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *,
buffer,
+               unsigned long, nbytes, int __user *, start, void
__user *, arg)
+{
+       /* [...snip...] */
+
+       __analyzer_dump_state ("taint", nbytes);  /* { dg-warning
"tainted" } */
+
+       /* TODO: should have an event explaining why "nbytes" is
treated as
+          attacker-controlled.  */
+
+       /* case GSI_GET_HWRPB: */
+               if (nbytes < sizeof(*hwrpb))
+                       return -1;
+
+               __analyzer_dump_state ("taint", nbytes);  /* { dg-
warning "has_lb" } */
+
+               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
dg-warning "use of attacker-controlled value 'nbytes' as size without
upper-bounds checking" } */
+                       return -2;
+
+               return 1;
+
+       /* [...snip...] */
+}
+
+/* With the fix for the sense of the size comparison.  */
+
+SYSCALL_DEFINE5(osf_getsysinfo_fixed, unsigned long, op, void __user
*, buffer,
+               unsigned long, nbytes, int __user *, start, void
__user *, arg)
+{
+       /* [...snip...] */
+
+       /* case GSI_GET_HWRPB: */
+               if (nbytes > sizeof(*hwrpb))
+                       return -1;
+               if (copy_to_user(buffer, hwrpb, nbytes) != 0) /* {
dg-bogus "attacker-controlled" } */
+                       return -2;
+
+               return 1;
+
+       /* [...snip...] */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
new file mode 100644
index 00000000000..0b9a94a8d6c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-1.c
@@ -0,0 +1,38 @@
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+       /* [...snip...] */
+       ssize_t (*store)(struct config_item *, const char *, size_t)
/* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+               __attribute__((tainted)); /* (this is added).  */
+};
+static inline struct gadget_info *to_gadget_info(struct config_item
*item)
+{
+        return container_of(to_config_group(item), struct
gadget_info, group);
+}
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+               const char *page, size_t len)
+{
+       struct gadget_info *gi = to_gadget_info(item);
+       char *name;
+       int ret;
+
+#if 0
+       /* FIXME: this is the fix.  */
+       if (strlen(page) < len)
+               return -EOVERFLOW;
+#endif
+
+       name = kstrdup(page, GFP_KERNEL);
+       if (!name)
+               return -ENOMEM;
+       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
controlled value 'len \[^\n\r\]+' as offset without upper-bounds
checking" } */
+               name[len - 1] = '\0'; /* { dg-warning "use of
attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
bounds checking" } */
+       /* [...snip...] */                              \
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
function 'gadget_dev_desc_UDC_store' used as initializer for field
'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
new file mode 100644
index 00000000000..e05da9276c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143-2.c
@@ -0,0 +1,32 @@
+/* See notes in this header.  */
+#include "taint-CVE-2020-13143.h"
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+struct configfs_attribute {
+       /* [...snip...] */
+       ssize_t (*store)(struct config_item *, const char *, size_t)
/* { dg-message "\\(1\\) field 'store' of 'struct configfs_attribute'
is marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+               __attribute__((tainted)); /* (this is added).  */
+};
+
+/* Highly simplified version.  */
+
+static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
+               const char *page, size_t len)
+{
+       /* TODO: ought to have state_change_event talking about where
the tainted value comes from.  */
+
+       char *name;
+       /* [...snip...] */
+
+       name = kstrdup(page, GFP_KERNEL);
+       if (!name)
+               return -ENOMEM;
+       if (name[len - 1] == '\n') /* { dg-warning "use of attacker-
controlled value 'len \[^\n\r\]+' as offset without upper-bounds
checking" } */
+               name[len - 1] = '\0';  /* { dg-warning "use of
attacker-controlled value 'len \[^\n\r\]+' as offset without upper-
bounds checking" } */
+       /* [...snip...] */
+       return 0;
+}
+
+CONFIGFS_ATTR(gadget_dev_desc_, UDC); /* { dg-message "\\(2\\)
function 'gadget_dev_desc_UDC_store' used as initializer for field
'store' marked with '__attribute__\\(\\(tainted\\)\\)'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
new file mode 100644
index 00000000000..0ba023539af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-CVE-2020-13143.h
@@ -0,0 +1,91 @@
+/* Shared header for the various taint-CVE-2020-13143.h tests.
+
+   "gadget_dev_desc_UDC_store in drivers/usb/gadget/configfs.c in
the
+   Linux kernel 3.16 through 5.6.13 relies on kstrdup without
considering
+   the possibility of an internal '\0' value, which allows attackers
to
+   trigger an out-of-bounds read, aka CID-15753588bcd4."
+
+   Fixed by 15753588bcd4bbffae1cca33c8ced5722477fe1f on linux-5.7.y
+   in linux-stable.  */
+
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include <stddef.h>
+
+/* Adapted from include/uapi/asm-generic/posix_types.h  */
+
+typedef unsigned int     __kernel_size_t;
+typedef int              __kernel_ssize_t;
+
+/* Adapted from include/linux/types.h  */
+
+//typedef __kernel_size_t              size_t;
+typedef __kernel_ssize_t       ssize_t;
+
+/* Adapted from include/linux/kernel.h  */
+
+#define container_of(ptr, type, member)
({                             \
+       void *__mptr = (void
*)(ptr);                                   \
+       /* [...snip...]
*/                                              \
+       ((type *)(__mptr - offsetof(type, member))); })
+
+/* Adapted from include/linux/configfs.h  */
+
+struct config_item {
+       /* [...snip...] */
+};
+
+struct config_group {
+       struct config_item              cg_item;
+       /* [...snip...] */
+};
+
+static inline struct config_group *to_config_group(struct
config_item *item)
+{
+       return item ? container_of(item,struct config_group,cg_item)
: NULL;
+}
+
+#define CONFIGFS_ATTR(_pfx, _name)                             \
+static struct configfs_attribute _pfx##attr_##_name = {        \
+       /* [...snip...] */                              \
+       .store          = _pfx##_name##_store,          \
+}
+
+/* Adapted from include/linux/compiler.h  */
+
+#define __force
+
+/* Adapted from include/asm-generic/errno-base.h  */
+
+#define        ENOMEM          12      /* Out of memory */
+
+/* Adapted from include/linux/types.h  */
+
+#define __bitwise__
+typedef unsigned __bitwise__ gfp_t;
+
+/* Adapted from include/linux/gfp.h  */
+
+#define ___GFP_WAIT            0x10u
+#define ___GFP_IO              0x40u
+#define ___GFP_FS              0x80u
+#define __GFP_WAIT     ((__force gfp_t)___GFP_WAIT)
+#define __GFP_IO       ((__force gfp_t)___GFP_IO)
+#define __GFP_FS       ((__force gfp_t)___GFP_FS)
+#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
+
+/* Adapted from include/linux/compiler_attributes.h  */
+
+#define __malloc                        __attribute__((__malloc__))
+
+/* Adapted from include/linux/string.h  */
+
+extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
+
+/* Adapted from drivers/usb/gadget/configfs.c  */
+
+struct gadget_info {
+       struct config_group group;
+       /* [...snip...] */                              \
+};
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
new file mode 100644
index 00000000000..4c567b2ffdf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-3.c
@@ -0,0 +1,21 @@
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+void *p;
+
+void __attribute__((tainted))
+test_1 (size_t sz) /* { dg-message "\\(1\\) function 'test_1' marked
with '__attribute__\\(\\(tainted\\)\\)'" } */
+{
+  /* TODO: should have a message saying why "sz" is tainted, e.g.
+     "treating 'sz' as attacker-controlled because 'test_1' is
marked with '__attribute__((tainted))'"  */
+
+  p = malloc (sz); /* { dg-warning "use of attacker-controlled value
'sz' as allocation size without upper-bounds checking" "warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
'sz' as allocation size without upper-bounds checking" "final event"
{ target *-*-* } .-1 } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
new file mode 100644
index 00000000000..f52cafcd71d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c
@@ -0,0 +1,31 @@
+// TODO: remove need for this option:
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+#include "analyzer-decls.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+/* malloc with tainted size from a syscall.  */
+
+struct arg_buf
+{
+  size_t sz;
+};
+
+void *p;
+
+void __attribute__((tainted))
+test_1 (void *data) /* { dg-message "\\(1\\) function 'test_1'
marked with '__attribute__\\(\\(tainted\\)\\)'" } */
+{
+  /* we should treat pointed-to-structs as tainted.  */
+  __analyzer_dump_state ("taint", data); /* { dg-warning "state:
'tainted'" } */
+
+  struct arg_buf *args = data;
+
+  __analyzer_dump_state ("taint", args); /* { dg-warning "state:
'tainted'" } */
+  __analyzer_dump_state ("taint", args->sz); /* { dg-warning "state:
'tainted'" } */
+
+  p = malloc (args->sz); /* { dg-warning "use of attacker-controlled
value '\\*args.sz' as allocation size without upper-bounds checking"
"warning" } */
+  /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value
'\\*args.sz' as allocation size without upper-bounds checking" "final
event" { target *-*-* } .-1 } */
+}




Reply via email to