Re: [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

David Malcolm via Gcc Wed, 09 Aug 2023 14:36:52 -0700

On Wed, 2023-08-09 at 15:22 -0400, Eric Feng wrote:
> Thank you for your help in getting dg-require-python-h working! I can
> confirm that the FAILs are related to differences between the --
> cflags
> affecting the gimple seen by the analyzer. For this reason, I have
> changed it to --includes for now.


Sounds good.

Eventually we'll probably want to support --cflags, but given that
every distribution probably has its own set of flags, it's a recipe for
an unpleasantly large test matrix, so just using --includes is a good
compromise.

> To be sure, I tested on Python 3.8 as
> well and it works as expected. I have also addressed the following
> comments on the WIP patch as you described.
> 
> -- Update Changelog entry to list new functions being simulated.
> -- Update region_model::get_or_create_region_for_heap_alloc leading
> comment.
> -- Change register_alloc to update_state_machine.
> -- Change move_ptr_sval_non_null to transition_ptr_sval_non_null.
> -- Static helper functions for:
>         -- Initializing ob_refcnt field.
>         -- Setting ob_type field.
>         -- Getting ob_base field.
>         -- Initializing heap allocated region for PyObjects.
>         -- Incrementing a field by one.
> -- Change arg_is_long_p to arg_is_integral_p.
> -- Extract common failure scenario for reusability.
> 
> The initial WIP patch using 
> 
> /* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */
> 
> have been bootstrapped and regtested on aarch64-unknown-linux-gnu.
> Since
> we did not change any core logic in the revision and the only changes
> within the analyzer core are changing variable names, is it OK for
> trunk. In the mean time, the revised patch is currently going through
> bootstrap and regtest process.

Thanks for the updated patch.

Unfortunately I just pushed a largish analyzer patch (r14-3114-
g73da34a538ddc2) which may well conflict with your patch, so please
rebase to beyond that.  

Sorry about this.

In particular note that there's no longer a default assignment to the
LHS at a call-site in region_model::on_call_pre; known_function
subclasses are now responsible for assigning to the LHS of the
callsite.  But I suspect that all the known_function subclasses in the
cpython plugin already do that.

Some nits inline below...

[...snip...]

> Some concessions were made to
> simplify the analysis process when comparing kf_PyList_Append with
> the
> real implementation. In particular, PyList_Append performs some
> optimization internally to try and avoid calls to realloc if
> possible. For simplicity, we assume that realloc is called every
> time.
> Also, we grow the size by just 1 (to ensure enough space for adding a
> new element) rather than abide by the heuristics that the actual
> implementation
> follows.

Might be worth capturing these notes as comments in the source (for
class kf_PyList_Append), rather than just within the commit message.

[...snip...]
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index e92b3f7b074..c338f045d92 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats
> (const svalue *size_in_bytes,
>     Use CTXT to complain about tainted sizes.
>  
>     Reuse an existing heap_allocated_region if it's not being
> referenced by
> -   this region_model; otherwise create a new one.  */
> +   this region_model; otherwise create a new one.
> +
> +   Optionally (update_state_machine) transitions the pointer
> pointing to the
> +   heap_allocated_region from start to assumed non-null.  */
>  
>  const region *
>  region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
> -                                                 
> region_model_context *ctxt)
> +       region_model_context *ctxt,
> +       bool update_state_machine,
> +       const call_details *cd)
>  {
>    /* Determine which regions are referenced in this region_model, so
> that
>       we can reuse an existing heap_allocated_region if it's not in
> use on
> @@ -5153,6 +5158,17 @@
> region_model::get_or_create_region_for_heap_alloc (const svalue
> *size_in_bytes,
>    if (size_in_bytes)
>      if (compat_types_p (size_in_bytes->get_type (), size_type_node))
>        set_dynamic_extents (reg, size_in_bytes, ctxt);
> +
> +       if (update_state_machine && cd)
> +               {
> +                       const svalue *ptr_sval = nullptr;
> +                       if (cd->get_lhs_type ())
> +       ptr_sval = m_mgr->get_ptr_svalue (cd->get_lhs_type (), reg);
> +                       else
> +       ptr_sval = m_mgr->get_ptr_svalue (NULL_TREE, reg);
> +                       transition_ptr_sval_non_null (ctxt,
> ptr_sval);

This if/else is redundant: the "else" is only reached if cd-
>get_lhs_type () is null, in which case you pass in NULL_TREE, so it
works the same either way.  Or am I missing something?

Also, it looks like something weird's happening with indentation in
this hunk.

> +               }
> +
>    return reg;
>  }
>  
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-
> model.h
> index 0cf38714c96..16c80a238bc 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -387,9 +387,9 @@ class region_model
>                        region_model_context *ctxt,
>                        rejected_constraint **out);
>  
> -  const region *
> -  get_or_create_region_for_heap_alloc (const svalue *size_in_bytes,
> -                                      region_model_context *ctxt);
> +  const region *get_or_create_region_for_heap_alloc (
> +      const svalue *size_in_bytes, region_model_context *ctxt,
> +      bool update_state_machine = false, const call_details *cd =
> nullptr);

Nit: non-standard indentation here.

>    const region *create_region_for_alloca (const svalue
> *size_in_bytes,
>                                           region_model_context
> *ctxt);
>    void get_referenced_base_regions (auto_bitmap &out_ids) const;
> @@ -476,6 +476,10 @@ class region_model
>                              const svalue *old_ptr_sval,
>                              const svalue *new_ptr_sval);
>  
> +  /* Implemented in sm-malloc.cc.  */
> +  void transition_ptr_sval_non_null (region_model_context *ctxt,
> +       const svalue *new_ptr_sval);

Nit: non-standard indentation here.

> +
>    /* Implemented in sm-taint.cc.  */
>    void mark_as_tainted (const svalue *sval,
>                         region_model_context *ctxt);
> diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
> index a8c63eb1ce8..bb8d83e4605 100644
> --- a/gcc/analyzer/sm-malloc.cc
> +++ b/gcc/analyzer/sm-malloc.cc
> @@ -434,6 +434,10 @@ public:
>                              const svalue *new_ptr_sval,
>                              const extrinsic_state &ext_state) const;
>  
> +  void transition_ptr_sval_non_null (region_model *model,
> sm_state_map *smap,
> +       const svalue *new_ptr_sval,
> +       const extrinsic_state &ext_state) const;

Nit: non-standard indentation here.

> +
>    standard_deallocator_set m_free;
>    standard_deallocator_set m_scalar_delete;
>    standard_deallocator_set m_vector_delete;
> @@ -2504,6 +2508,16 @@ on_realloc_with_move (region_model *model,
>                    NULL, ext_state);
>  }
>  
> +/*  Hook for get_or_create_region_for_heap_alloc for the case when
> we want
> +   ptr_sval to mark a newly created region as assumed non null on
> malloc SM.  */
> +void
> +malloc_state_machine::transition_ptr_sval_non_null (
> +    region_model *model, sm_state_map *smap, const svalue
> *new_ptr_sval,
> +    const extrinsic_state &ext_state) const

Nit: non-standard indentation here.

> +{
> +  smap->set_state (model, new_ptr_sval, m_free.m_nonnull, NULL,
> ext_state);
> +}
> 
[...]

> diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> index 9ecc42d4465..42c8aff101e 100644
> --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c

[...]

> @@ -76,6 +78,703 @@ get_field_by_name (tree type, const char *name)
>    return NULL_TREE;
>  }
>  
> +static const svalue *
> +get_sizeof_pyobjptr (region_model_manager *mgr)
> +{
> +  tree size_tree = TYPE_SIZE_UNIT (pyobj_ptr_tree);
> +  const svalue *sizeof_sval = mgr->get_or_create_constant_svalue
> (size_tree);
> +  return sizeof_sval;
> +}
> +
> +static bool
> +arg_is_integral_p(const call_details &cd, unsigned idx)
> +{
> +  return INTEGRAL_TYPE_P(cd.get_arg_type(idx));
> +}

We already have a call_details::arg_is_pointer_p, so perhaps this
should be a call_details::arg_is_integral_p?

> +
> +static void
> +init_ob_refcnt_field (region_model_manager *mgr, region_model
> *model,
> +                      const region *ob_base_region, tree
> pyobj_record,
> +                      const call_details &cd)
> +{
> +  tree ob_refcnt_tree = get_field_by_name (pyobj_record,
> "ob_refcnt");
> +  const region *ob_refcnt_region
> +      = mgr->get_field_region (ob_base_region, ob_refcnt_tree);
> +  const svalue *refcnt_one_sval
> +      = mgr->get_or_create_int_cst (size_type_node, 1);
> +  model->set_value (ob_refcnt_region, refcnt_one_sval, cd.get_ctxt
> ());
> +}

Please add a leading comment to this function, something like:

/* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1.  */

> +
> +static void
> +set_ob_type_field (region_model_manager *mgr, region_model *model,
> +                   const region *ob_base_region, tree pyobj_record,
> +                   tree pytype_var_decl_ptr, const call_details &cd)
> +{
> +  const region *pylist_type_region
> +      = mgr->get_region_for_global (pytype_var_decl_ptr);
> +  tree pytype_var_decl_ptr_type
> +      = build_pointer_type (TREE_TYPE (pytype_var_decl_ptr));
> +  const svalue *pylist_type_ptr_sval
> +      = mgr->get_ptr_svalue (pytype_var_decl_ptr_type,
> pylist_type_region);
> +  tree ob_type_field = get_field_by_name (pyobj_record, "ob_type");
> +  const region *ob_type_region
> +      = mgr->get_field_region (ob_base_region, ob_type_field);
> +  model->set_value (ob_type_region, pylist_type_ptr_sval,
> cd.get_ctxt ());
> +}

Likewise, this needs a leading comment, something like:

/* Update MODEL to set OB_BASE_REGION's ob_type to point to
   PYTYPE_VAR_DECL_PTR.  */

> +
> +static const region *
> +get_ob_base_region (region_model_manager *mgr, region_model *model,
> +                   const region *new_object_region, tree
> object_record,
> +                   const svalue *pyobj_svalue, const call_details
> &cd)
> +{
> +  tree ob_base_tree = get_field_by_name (object_record, "ob_base");
> +  const region *ob_base_region
> +      = mgr->get_field_region (new_object_region, ob_base_tree);
> +  model->set_value (ob_base_region, pyobj_svalue, cd.get_ctxt ());
> +  return ob_base_region;
> +}

Likewise, needs a leading comment.  It isn't clear to me what the
intent of this function is.  I see it used from
kf_PyLong_FromLong::impl_call_post's outcome handler, where it seems to
be used to set the ob_base_region to an unknown value.

> +
> +static const region *
> +init_pyobject_region (region_model_manager *mgr, region_model
> *model,
> +                      const svalue *object_svalue, const
> call_details &cd)
> +{
> +  /* TODO: switch to actual tp_basic_size */
> +  const svalue *tp_basicsize_sval = mgr-
> >get_or_create_unknown_svalue (NULL);
> +  const region *pyobject_region = model-
> >get_or_create_region_for_heap_alloc (
> +      tp_basicsize_sval, cd.get_ctxt (), true, &cd);
> +  model->set_value (pyobject_region, object_svalue, cd.get_ctxt ());
> +  return pyobject_region;
> +}

Likewise needs a leading comment, and the exact intent isn't quite
clear to me.  I believe that everywhere you're calling it, you're
passing in an unknown svalue for "object_svalue".

[...]

> +class kf_PyList_Append : public known_function
> +{
> +public:
> +  bool
> +  matches_call_types_p (const call_details &cd) const final override
> +  {
> +    return (cd.num_args () == 2); // TODO: more checks here

Probably: 
  && cd.arg_is_pointer_p (0)
  && cd.arg_is_pointer_p (1)

> +  }
> +  void impl_call_pre (const call_details &cd) const final override;
> +  void impl_call_post (const call_details &cd) const final override;
> +};
> +

[...snip kf_PyList_Append implementation...]

I confess that my eyes started glazing over at the kf_PyList_Append
implementation.  Given that this is just within the test plugin, I'll
defer to you on this.


> +class kf_PyList_New : public known_function
> +{
> +public:
> +  bool
> +  matches_call_types_p (const call_details &cd) const final override
> +  {
> +    return (cd.num_args () == 1 && arg_is_integral_p (cd, 0));
> +  }
> +  void impl_call_post (const call_details &cd) const final override;
> +};
> +
> +void
> +kf_PyList_New::impl_call_post (const call_details &cd) const
> +{
> +  class success : public call_info
> +  {
> +  public:
> +    success (const call_details &cd) : call_info (cd) {}
> +
> +    label_text
> +    get_desc (bool can_colorize) const final override
> +    {
> +      return make_label_text (can_colorize, "when %qE succeeds",
> +                              get_fndecl ());
> +    }
> +
> +    bool
> +    update_model (region_model *model, const exploded_edge *,
> +                  region_model_context *ctxt) const final override
> +    {
> +      const call_details cd (get_call_details (model, ctxt));
> +      region_model_manager *mgr = cd.get_manager ();
> +
> +      const svalue *pyobj_svalue
> +          = mgr->get_or_create_unknown_svalue (pyobj_record);
> +      const svalue *varobj_svalue
> +          = mgr->get_or_create_unknown_svalue (varobj_record);
> +      const svalue *pylist_svalue
> +          = mgr->get_or_create_unknown_svalue (pylistobj_record);
> +
> +      const svalue *size_sval = cd.get_arg_svalue (0);
> +
> +      const region *pylist_region
> +          = init_pyobject_region (mgr, model, pylist_svalue, cd);
> +
> +      /*
> +      typedef struct
> +      {
> +        PyObject_VAR_HEAD
> +        PyObject **ob_item;
> +        Py_ssize_t allocated;
> +      } PyListObject;
> +      */
> +      tree varobj_field = get_field_by_name (pylistobj_record,
> "ob_base");
> +      const region *varobj_region
> +          = mgr->get_field_region (pylist_region, varobj_field);
> +      model->set_value (varobj_region, varobj_svalue, cd.get_ctxt
> ());
> +
> +      tree ob_item_field = get_field_by_name (pylistobj_record,
> "ob_item");
> +      const region *ob_item_region
> +          = mgr->get_field_region (pylist_region, ob_item_field);
> +
> +      const svalue *zero_sval = mgr->get_or_create_int_cst
> (size_type_node, 0);
> +      const svalue *casted_size_sval
> +          = mgr->get_or_create_cast (size_type_node, size_sval);
> +      const svalue *size_cond_sval = mgr->get_or_create_binop (
> +          size_type_node, LE_EXPR, casted_size_sval, zero_sval);
> +
> +      // if size <= 0, ob_item = NULL
> +
> +      if (tree_int_cst_equal (size_cond_sval->maybe_get_constant (),
> +                              integer_one_node))
> +        {

I think the way I'd do this is to bifurcate on the <= 0 versus > 0
case, and add the constraints to the model, as this ought to better
handle non-constant values for the size.

But this is good enough for now.

> +          const svalue *null_sval
> +              = mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
> +          model->set_value (ob_item_region, null_sval, cd.get_ctxt
> ());
> +        }
> +      else // calloc
> +        {
> +          const svalue *sizeof_sval = mgr->get_or_create_cast (
> +              size_sval->get_type (), get_sizeof_pyobjptr (mgr));
> +          const svalue *prod_sval = mgr->get_or_create_binop (
> +              size_type_node, MULT_EXPR, sizeof_sval, size_sval);
> +          const region *ob_item_sized_region
> +              = model->get_or_create_region_for_heap_alloc
> (prod_sval,
> +                                                           
> cd.get_ctxt ());
> +          model->zero_fill_region (ob_item_sized_region);
> +          const svalue *ob_item_ptr_sval
> +              = mgr->get_ptr_svalue (pyobj_ptr_ptr,
> ob_item_sized_region);
> +          const svalue *ob_item_unmergeable
> +              = mgr->get_or_create_unmergeable (ob_item_ptr_sval);
> +          model->set_value (ob_item_region, ob_item_unmergeable,
> +                            cd.get_ctxt ());
> +        }
> +
> +      /*
> +      typedef struct {
> +      PyObject ob_base;
> +      Py_ssize_t ob_size; // Number of items in variable part
> +      } PyVarObject;
> +      */
> +      const region *ob_base_region = get_ob_base_region (
> +          mgr, model, varobj_region, varobj_record, pyobj_svalue,
> cd);
> +
> +      tree ob_size_tree = get_field_by_name (varobj_record,
> "ob_size");
> +      const region *ob_size_region
> +          = mgr->get_field_region (varobj_region, ob_size_tree);
> +      model->set_value (ob_size_region, size_sval, cd.get_ctxt ());
> +
> +      /*
> +      typedef struct _object {
> +          _PyObject_HEAD_EXTRA
> +          Py_ssize_t ob_refcnt;
> +          PyTypeObject *ob_type;
> +      } PyObject;
> +      */
> +
> +      // Initialize ob_refcnt field to 1.
> +      init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record,
> cd);
> +
> +      // Get pointer svalue for PyList_Type then assign it to
> ob_type field.
> +      set_ob_type_field(mgr, model, ob_base_region, pyobj_record,
> pylisttype_vardecl, cd);
> +
> +      if (cd.get_lhs_type ())
> +        {
> +          const svalue *ptr_sval
> +              = mgr->get_ptr_svalue (cd.get_lhs_type (),
> pylist_region);
> +          cd.maybe_set_lhs (ptr_sval);
> +        }
> +      return true;
> +    }
> +  };
> +
> +  if (cd.get_ctxt ())
> +    {
> +      cd.get_ctxt ()->bifurcate (make_unique<pyobj_init_fail> (cd));
> +      cd.get_ctxt ()->bifurcate (make_unique<success> (cd));
> +      cd.get_ctxt ()->terminate_path ();
> +    }
> +}
> +
> +class kf_PyLong_FromLong : public known_function
> +{
> +public:
> +  bool
> +  matches_call_types_p (const call_details &cd) const final override
> +  {
> +    return (cd.num_args () == 1 && arg_is_integral_p (cd, 0));
> +  }
> +  void impl_call_post (const call_details &cd) const final override;
> +};
> +
> +void
> +kf_PyLong_FromLong::impl_call_post (const call_details &cd) const
> +{
> +  class success : public call_info
> +  {
> +  public:
> +    success (const call_details &cd) : call_info (cd) {}
> +
> +    label_text
> +    get_desc (bool can_colorize) const final override
> +    {
> +      return make_label_text (can_colorize, "when %qE succeeds",
> +                              get_fndecl ());
> +    }

If you subclass from success_call_info then you get an equivalent
get_desc implementation "for free".

> +
> +    bool
> +    update_model (region_model *model, const exploded_edge *,
> +                  region_model_context *ctxt) const final override
> +    {

Could add a comment here that we *don't* attempt to simulate the
special-casing that CPython does for values -5 to 256 (see
https://docs.python.org/3/c-api/long.html#c.PyLong_FromLong ).

> +      const call_details cd (get_call_details (model, ctxt));
> +      region_model_manager *mgr = cd.get_manager ();
> +
> +      const svalue *pyobj_svalue
> +          = mgr->get_or_create_unknown_svalue (pyobj_record);
> +      const svalue *pylongobj_sval
> +          = mgr->get_or_create_unknown_svalue (pylongobj_record);
> +
> +      const region *pylong_region
> +          = init_pyobject_region (mgr, model, pylongobj_sval, cd);
> +
> +      // Create a region for the base PyObject within the
> PyLongObject.
> +      const region *ob_base_region = get_ob_base_region (
> +          mgr, model, pylong_region, pylongobj_record, pyobj_svalue,
> cd);
> +
> +      // Initialize ob_refcnt field to 1.
> +      init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record,
> cd);
> +
> +      // Get pointer svalue for PyLong_Type then assign it to
> ob_type field.
> +      set_ob_type_field(mgr, model, ob_base_region, pyobj_record,
> pylongtype_vardecl, cd);
> +
> +      // Set the PyLongObject value.
> +      tree ob_digit_field = get_field_by_name (pylongobj_record,
> "ob_digit");
> +      const region *ob_digit_region
> +          = mgr->get_field_region (pylong_region, ob_digit_field);
> +      const svalue *ob_digit_sval = cd.get_arg_svalue (0);
> +      model->set_value (ob_digit_region, ob_digit_sval, cd.get_ctxt
> ());
> +
> +      if (cd.get_lhs_type ())
> +        {
> +          const svalue *ptr_sval
> +              = mgr->get_ptr_svalue (cd.get_lhs_type (),
> pylong_region);
> +          cd.maybe_set_lhs (ptr_sval);
> +        }
> +      return true;
> +    }
> +  };
> +
> +  if (cd.get_ctxt ())
> +    {
> +      cd.get_ctxt ()->bifurcate (make_unique<pyobj_init_fail> (cd));
> +      cd.get_ctxt ()->bifurcate (make_unique<success> (cd));
> +      cd.get_ctxt ()->terminate_path ();
> +    }
> +}

[...]


> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> new file mode 100644
> index 00000000000..19b5c17428a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyList_New (Py_ssize_t len)
> +{
> +  PyObject *obj = PyList_New (len);
> +  if (obj)
> +    {
> +     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" }
> */
> +     __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning
> "TRUE" } */
> +    }
> +  else
> +    __analyzer_dump_path (); /* { dg-message "path" } */
> +  return obj;
> +}

There's lots of scope for extra test coverage here.

For example, for all these test_FOO_New cases, consider a variant with
"void" return and no "return obj;" at the end.  The analyzer ought to
report a leak when we fall off the end of these functions.

Similarly, it's good to try both:
- symbolic values for arguments (like you have here)
- constant values (for example, what happens with NULL for pointer
params?)

FWIW I like to organize test coverage for specific known functions into
test cases named after the function, so perhaps we could have:
  cpython-plugin-test-PyList_Append.c
  cpython-plugin-test-PyList_New.c
  cpython-plugin-test-PyLong_New.c
but that's up to you.

All this can be left to a follow-up, though.

> +
> +PyObject *
> +test_PyLong_New (long n)
> +{
> +  PyObject *obj = PyLong_FromLong (n);
> +  if (obj)
> +    {
> +     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" }
> */
> +     __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning
> "TRUE" } */
> +    }
> +  else
> +    __analyzer_dump_path (); /* { dg-message "path" } */
> +  return obj;
> +}
> +
> +PyObject *
> +test_PyListAppend (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  PyObject *list = PyList_New (0);
> +  PyList_Append(list, item);
> +  return list; /* { dg-warning "leak of 'item'" } */
> +}
> +
> +PyObject *
> +test_PyListAppend_2 (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  if (!item)
> +       return NULL;
> +
> +  __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" }
> */
> +  PyObject *list = PyList_New (n);
> +  if (!list)
> +  {
> +       Py_DECREF(item);
> +       return NULL;
> +  }
> +
> +  __analyzer_eval (list->ob_refcnt == 1); /* { dg-warning "TRUE" }
> */
> +
> +  if (PyList_Append (list, item) < 0)
> +    __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" }
> */
> +  else
> +    __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" }
> */
> +  return list; /* { dg-warning "leak of 'item'" } */
> +}
> +
> +
> +PyObject *
> +test_PyListAppend_3 (PyObject *item, PyObject *list)
> +{
> +  PyList_Append (list, item);
> +  return list;
> +}
> \ No newline at end of file

[...]

Overall, I think that assuming the rebase is trivial then with the nits
fixed, this is good for trunk.  As noted above there are some issues
with the known_function implementations in the plugin, but that's a
minor detail that doesn't impact anyone else, so let's not perfect be
the enemy of the good.

Hope the above makes sense; thanks again for the patch.
Dave

Re: [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646]

Reply via email to