https://gcc.gnu.org/g:c2c64cfcd07b1060a6c16d1695972938ea643c1f
commit r16-7938-gc2c64cfcd07b1060a6c16d1695972938ea643c1f
Author: David Malcolm <[email protected]>
Date:   Fri Mar 6 18:47:05 2026 -0500

    testsuite: fix ICEs in analyzer plugin with CPython >= 3.11 [PR107646,PR112520]

    In GCC 14 the testsuite gained a plugin that "teaches" the analyzer
    about the CPython API, trying to find common mistakes:
      https://gcc.gnu.org/wiki/StaticAnalyzer/CPython

    Unfortunately, this has been crashing for more recent versions of
    CPython.

    Specifically, in Python 3.11, PyObject's ob_refcnt was moved to an
    anonymous union (as part of PEP 683 "Immortal Objects, Using a Fixed
    Refcount").  The plugin attempts to find the field but fails, and has
    no error-handling, leading to a null pointer dereference.

    Also, https://github.com/python/cpython/pull/101292 moved "ob_digit"
    from struct _longobject to a new field long_value of a new
    struct _PyLongValue, leading to similar analyzer crashes when not
    finding the field.

    The following patch fixes this by
    * looking within the anonymous union for the ob_refcnt field if it
      can't find it directly
    * gracefully handling the case of not finding "ob_digit" in
      PyLongObject
    * doing more lookups once at plugin startup, rather than continuously
      on analyzing API calls
    * adding diagnostics and more error-handling to the plugin startup, so
      that if it can't find something in the Python headers it emits a
      useful note when disabling itself, e.g.
        cc1: note: could not find field 'ob_digit' of CPython type
          'PyLongObject' {aka 'struct _longobject'}
    * replacing some copy-and-pasted code with member functions of a new
      "class api" (though various other cleanups could be done)

    Tested with:
    * CPython 3.8: all tests continue to PASS
    * CPython 3.13: fixes the ICEs, 2 FAILs remain (reference counting
      false negatives)

    Given that this is already a large patch, I'm opting to only fix the
    crashes and defer the 2 remaining FAILs and other cleanups to followup
    work.
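For illustration only (this is not part of the patch): the anonymous-union
fallback described in the first bullet can be sketched outside of GCC with a
toy model.  The names Field, RecordType, find_field and the two make_* helpers
are hypothetical stand-ins for GCC's tree nodes and the plugin's
get_field_by_name; the union member "other_member" is invented for the
example.  The sketch only assumes what the message states: unnamed members
must not crash the lookup, and "ob_refcnt" alone is searched for inside
nested members.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for GCC tree nodes; this is not GCC's real API.
struct RecordType;

struct Field
{
  std::string name;                  // empty for an anonymous member
  std::shared_ptr<RecordType> type;  // non-null if the member is a struct/union
};

struct RecordType
{
  std::string name;
  std::vector<Field> fields;
};

// Look up NAME among the fields of TYPE.  Unnamed members are skipped
// rather than dereferenced.  Only for "ob_refcnt" do we also look inside
// nested members, loosely mirroring the special case in the patch.
// If COMPLAIN is true, print a note when nothing is found.
const Field *
find_field (const RecordType &type, const std::string &name,
            bool complain = true)
{
  for (const Field &f : type.fields)
    {
      if (!f.name.empty () && f.name == name)
        return &f;
      if (name == "ob_refcnt" && f.type)
        if (const Field *sub = find_field (*f.type, name, false))
          return sub;
    }
  if (complain)
    std::cerr << "note: could not find field '" << name
              << "' of type '" << type.name << "'\n";
  return nullptr;
}

// Toy layout in which ob_refcnt is a direct member.
RecordType
make_pyobject_direct ()
{
  return RecordType{"PyObject",
                    {Field{"ob_refcnt", nullptr}, Field{"ob_type", nullptr}}};
}

// Toy layout in which ob_refcnt lives inside an unnamed union.
RecordType
make_pyobject_with_anon_union ()
{
  auto u = std::make_shared<RecordType> (
    RecordType{"<anonymous union>",
               {Field{"ob_refcnt", nullptr}, Field{"other_member", nullptr}}});
  return RecordType{"PyObject", {Field{"", u}, Field{"ob_type", nullptr}}};
}
```

With either toy layout, find_field (layout, "ob_refcnt") returns the field,
whereas a name that is absent yields a null pointer (plus a note when
COMPLAIN is true), which callers are expected to check.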
    gcc/analyzer/ChangeLog:
            PR testsuite/112520
            * region-model-manager.cc
            (region_model_manager::get_field_region): Assert that the args
            are non-null.

    gcc/testsuite/ChangeLog:
            PR analyzer/107646
            PR testsuite/112520
            * gcc.dg/plugin/analyzer_cpython_plugin.cc: Move everything from
            namespace ana:: into ana::cpython_plugin.  Move global tree
            values into a new "class api".
            (pyobj_record): Replace with api.m_type_PyObject.
            (pyobj_ptr_tree): Replace with api.m_type_PyObject_ptr.
            (pyobj_ptr_ptr): Replace with api.m_type_PyObject_ptr_ptr.
            (varobj_record): Replace with api.m_type_PyVarObject.
            (pylistobj_record): Replace with api.m_type_PyListObject.
            (pylongobj_record): Replace with api.m_type_PyLongObject.
            (pylongtype_vardecl): Replace with api.m_vardecl_PyLong_Type.
            (pylisttype_vardecl): Replace with api.m_vardecl_PyList_Type.
            (get_field_by_name): Add "complain" param and use it to issue a
            note on failure.  Assert that type and name are non-null.  Don't
            crash on fields that are anonymous unions, and special-case
            looking within them for "ob_refcnt" to work around the
            Python 3.11 change for PEP 683 (immortal objects).
            (get_sizeof_pyobjptr): Convert to...
            (api::get_sval_sizeof_PyObject_ptr): ...this.
            (init_ob_refcnt_field): Convert to...
            (api::init_ob_refcnt_field): ...this.
            (set_ob_type_field): Convert to...
            (api::set_ob_type_field): ...this.
            (api::init_PyObject_HEAD): New.
            (api::get_region_PyObject_ob_refcnt): New.
            (api::do_Py_INCREF): New.
            (api::get_region_PyVarObject_ob_size): New.
            (api::get_region_PyLongObject_ob_digit): New.
            (inc_field_val): Convert to...
            (api::inc_field_val): ...this.
            (refcnt_mismatch::refcnt_mismatch): Add tree params for
            refcounts and initialize corresponding fields.  Fix whitespace.
            (refcnt_mismatch::emit): Use stored tree values, rather than
            assuming we have constants and crashing on non-constants.
            Delete commented-out dead code.
            (refcnt_mismatch::foo): Delete.
            (refcnt_mismatch::m_expected_refcnt_tree): New field.
            (refcnt_mismatch::m_actual_refcnt_tree): New field.
            (retrieve_ob_refcnt_sval): Simplify using class api.
            (count_pyobj_references): Likewise.
            (check_refcnt): Likewise.  Don't warn on UNKNOWN values.  Use
            get_representative_tree for the expected and actual values and
            skip the warning if it fails, rather than assuming we have
            constants and crashing on non-constants.
            (count_all_references): Update comment.
            (kf_PyList_Append::impl_call_pre): Simplify using class api.
            (kf_PyList_Append::impl_call_post): Likewise.
            (kf_PyList_New::impl_call_post): Likewise.
            (kf_PyLong_FromLong::impl_call_post): Likewise.
            (get_stashed_type_by_name): Emit note if the type couldn't be
            found.
            (get_stashed_global_var_by_name): Likewise for globals.
            (init_py_structs): Convert to...
            (api::init_from_stashed_types): ...this.  Bail out with an error
            code if anything fails.  Look up more things at startup, rather
            than during analysis of calls.
            (ana::cpython_analyzer_events_subscriber): Rename to...
            (ana::cpython_plugin::analyzer_events_subscriber): ...this.
            (analyzer_events_subscriber::analyzer_events_subscriber):
            Initialize m_init_failed.
            (analyzer_events_subscriber::on_message<on_tu_finished>): Update
            for conversion of init_py_structs to
            api::init_from_stashed_types and bail if it fails.
            (analyzer_events_subscriber::on_message<on_frame_popped>): Don't
            run if plugin initialization failed.
            (analyzer_events_subscriber::m_init_failed): New field.

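For illustration only (again not part of the patch): the "look everything up
once at startup, and disable the plugin if anything is missing" pattern from
the entries for api::init_from_stashed_types and m_init_failed can be
sketched as a toy model.  The names type_table, api_sketch and
subscriber_sketch are hypothetical, as are the integer "handles".  Stopping
at the first missing name and defaulting the flag to "failed" until
initialization succeeds are assumptions of the sketch, not confirmed
behavior of the plugin.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the types stashed by the frontend
// (name -> opaque handle).
using type_table = std::map<std::string, int>;

class api_sketch
{
public:
  // Resolve every name we need, once.  Return true if everything was
  // found; otherwise record a note and return false (assumption: we
  // stop at the first missing name).
  bool
  init_from_table (const type_table &stashed, std::vector<std::string> &notes)
  {
    for (const char *name :
         {"PyObject", "PyVarObject", "PyListObject", "PyLongObject"})
      {
        auto it = stashed.find (name);
        if (it == stashed.end ())
          {
            notes.push_back (std::string ("could not find CPython type '")
                             + name + "'");
            return false;
          }
        m_handles[name] = it->second;
      }
    return true;
  }

  // Accessor for a handle cached by init_from_table.
  int
  get (const std::string &name) const
  {
    return m_handles.at (name);
  }

private:
  std::map<std::string, int> m_handles;
};

class subscriber_sketch
{
public:
  // Called once, when the translation unit has been parsed.
  void
  on_tu_finished (const type_table &stashed)
  {
    m_init_failed = !m_api.init_from_table (stashed, m_notes);
  }

  // Called repeatedly during analysis.  Returns true if the check ran,
  // false if it was skipped because initialization failed.
  bool
  on_frame_popped ()
  {
    if (m_init_failed)
      return false;
    // ... analysis relying on m_api's cached handles would go here ...
    return true;
  }

  const std::vector<std::string> &notes () const { return m_notes; }

private:
  api_sketch m_api;
  std::vector<std::string> m_notes;
  bool m_init_failed = true;  // assumption: unusable until init succeeds
};
```

The point of the design is that later event handlers never perform lookups
that can fail: either everything was resolved up front, or the handlers
return early.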
    Signed-off-by: David Malcolm <[email protected]>

Diff:
---
 gcc/analyzer/region-model-manager.cc               |   2 +
 .../gcc.dg/plugin/analyzer_cpython_plugin.cc       | 643 +++++++++++++++------
 2 files changed, 459 insertions(+), 186 deletions(-)

diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index 892d60697be2..33d9a62937c3 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -1689,6 +1689,8 @@ region_model_manager::get_unknown_symbolic_region (tree region_type)
 const region *
 region_model_manager::get_field_region (const region *parent, tree field)
 {
+  gcc_assert (parent);
+  gcc_assert (field);
   gcc_assert (TREE_CODE (field) == FIELD_DECL);

   /* (*UNKNOWN_PTR).field is (*UNKNOWN_PTR_OF_&FIELD_TYPE).  */
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.cc b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.cc
index cde45c438763..5aa79b1e6412 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.cc
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.cc
@@ -51,72 +51,314 @@ int plugin_is_GPL_compatible;
 static GTY (()) hash_map<tree, tree> *analyzer_stashed_types;
 static GTY (()) hash_map<tree, tree> *analyzer_stashed_globals;

-namespace ana
-{
-static tree pyobj_record = NULL_TREE;
-static tree pyobj_ptr_tree = NULL_TREE;
-static tree pyobj_ptr_ptr = NULL_TREE;
-static tree varobj_record = NULL_TREE;
-static tree pylistobj_record = NULL_TREE;
-static tree pylongobj_record = NULL_TREE;
-static tree pylongtype_vardecl = NULL_TREE;
-static tree pylisttype_vardecl = NULL_TREE;
+namespace ana {
+namespace cpython_plugin {

 static tree
-get_field_by_name (tree type, const char *name)
+get_field_by_name (tree type, const char *name, bool complain = true)
 {
+  gcc_assert (type);
+  gcc_assert (name);
   for (tree field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
     {
       if (TREE_CODE (field) == FIELD_DECL)
-        {
-          const char *field_name = IDENTIFIER_POINTER (DECL_NAME (field));
- if (strcmp (field_name, name) == 0) - return field; - } + if (tree id = DECL_NAME (field)) + { + const char *field_name = IDENTIFIER_POINTER (id); + if (strcmp (field_name, name) == 0) + return field; + } + + /* Prior to python 3.11, ob_refcnt a field of PyObject. + In Python 3.11 ob_refcnt was moved to an anonymous union within + PyObject (as part of PEP 683 "Immortal Objects, Using a + Fixed Refcount"). */ + if (0 == strcmp (name, "ob_refcnt")) + if (tree subfield = get_field_by_name (TREE_TYPE (field), name, false)) + return subfield; } + if (complain) + inform (UNKNOWN_LOCATION, "could not find field %qs of CPython type %qT", + name, type); return NULL_TREE; } -static const svalue * -get_sizeof_pyobjptr (region_model_manager *mgr) -{ - tree size_tree = TYPE_SIZE_UNIT (pyobj_ptr_tree); - const svalue *sizeof_sval = mgr->get_or_create_constant_svalue (size_tree); - return sizeof_sval; -} +/* Global state and utils for working with the CPython API. -/* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1. */ -static void -init_ob_refcnt_field (region_model_manager *mgr, region_model *model, - const region *ob_base_region, tree pyobj_record, - const call_details &cd) + Attempt to provide some isolation against changes to the API in + different versions of CPython. + + This only persists during the analyzer, and thus doesn't need + to interact with the garbage collector. 
*/ + +class api { - tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt"); - const region *ob_refcnt_region +public: + bool init_from_stashed_types (); + + // Get RECORD_TYPE for PyObject + tree + get_type_PyObject () const + { + gcc_assert (m_type_PyObject); + return m_type_PyObject; + } + + // Get FIELD_DECL for PyObject's "ob_refcnt" + tree + get_field_PyObject_ob_refcnt () const + { + gcc_assert (m_field_PyObject_ob_refcnt); + return m_field_PyObject_ob_refcnt; + } + + // Get FIELD_DECL for PyObject's "ob_type" + tree + get_field_PyObject_ob_type () const + { + gcc_assert (m_field_PyObject_ob_type); + return m_field_PyObject_ob_type; + } + + // Get POINTER_TYPE for "PyObject *" + tree + get_type_PyObject_ptr () const + { + gcc_assert (m_type_PyObject_ptr); + return m_type_PyObject_ptr; + } + + // Get POINTER_TYPE for "PyObject **" + tree + get_type_PyObject_ptr_ptr () const + { + gcc_assert (m_type_PyObject_ptr_ptr); + return m_type_PyObject_ptr_ptr; + } + + // Get RECORD_TYPE for CPython's PyVarObject + tree + get_type_PyVarObject () const + { + gcc_assert (m_type_PyVarObject); + return m_type_PyVarObject; + } + + // Get FIELD_DECL for PyVarObject's "ob_size" + tree + get_field_PyVarObject_ob_size () const + { + gcc_assert (m_field_PyVarObject_ob_size); + return m_field_PyVarObject_ob_size; + } + + // Get RECORD_TYPE for CPython's PyListObject + tree + get_type_PyListObject () const + { + gcc_assert (m_type_PyListObject); + return m_type_PyListObject; + } + + // Get RECORD_TYPE for CPython's PyLongObject + tree + get_type_PyLongObject () const + { + gcc_assert (m_type_PyLongObject); + return m_type_PyLongObject; + } + + // Get FIELD_DECL for PyListObject's "ob_item" + tree + get_field_PyListObject_ob_item () const + { + gcc_assert (m_field_PyListObject_ob_item); + return m_field_PyListObject_ob_item; + } + + // Get VAR_DECL for CPython's global var PyList_Type + tree + get_vardecl_PyList_Type () const + { + gcc_assert (m_vardecl_PyList_Type); + 
return m_vardecl_PyList_Type; + } + + // Get VAR_DECL for CPython's global var PyLong_Type + tree + get_vardecl_PyLong_Type () const + { + gcc_assert (m_vardecl_PyLong_Type); + return m_vardecl_PyLong_Type; + } + + /* Get an ana::svalue for "sizeof (PyObject *)". */ + const svalue * + get_sval_sizeof_PyObject_ptr (region_model_manager *mgr) const + { + tree size_tree = TYPE_SIZE_UNIT (m_type_PyObject_ptr); + const svalue *sizeof_sval = mgr->get_or_create_constant_svalue (size_tree); + return sizeof_sval; + } + + /* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1. */ + void + init_ob_refcnt_field (region_model *model, + const region *ob_base_region, + region_model_context *ctxt) + { + region_model_manager *mgr = model->get_manager (); + tree ob_refcnt_tree = get_field_PyObject_ob_refcnt (); + const region *ob_refcnt_region = mgr->get_field_region (ob_base_region, ob_refcnt_tree); - const svalue *refcnt_one_sval + const svalue *refcnt_one_sval = mgr->get_or_create_int_cst (size_type_node, 1); - model->set_value (ob_refcnt_region, refcnt_one_sval, cd.get_ctxt ()); -} + model->set_value (ob_refcnt_region, refcnt_one_sval, ctxt); + } -/* Update MODEL to set OB_BASE_REGION's ob_type to point to - PYTYPE_VAR_DECL_PTR. */ -static void -set_ob_type_field (region_model_manager *mgr, region_model *model, - const region *ob_base_region, tree pyobj_record, - tree pytype_var_decl_ptr, const call_details &cd) -{ - const region *pylist_type_region + /* Update MODEL to set OB_BASE_REGION's ob_type to point to + PYTYPE_VAR_DECL_PTR. 
*/ + void + set_ob_type_field (region_model *model, + const region *ob_base_region, + tree pytype_var_decl_ptr, + region_model_context *ctxt) + { + region_model_manager *mgr = model->get_manager (); + const region *pylist_type_region = mgr->get_region_for_global (pytype_var_decl_ptr); - tree pytype_var_decl_ptr_type + tree pytype_var_decl_ptr_type = build_pointer_type (TREE_TYPE (pytype_var_decl_ptr)); - const svalue *pylist_type_ptr_sval + const svalue *pylist_type_ptr_sval = mgr->get_ptr_svalue (pytype_var_decl_ptr_type, pylist_type_region); - tree ob_type_field = get_field_by_name (pyobj_record, "ob_type"); - const region *ob_type_region + tree ob_type_field = get_field_PyObject_ob_type (); + const region *ob_type_region = mgr->get_field_region (ob_base_region, ob_type_field); - model->set_value (ob_type_region, pylist_type_ptr_sval, cd.get_ctxt ()); -} + model->set_value (ob_type_region, pylist_type_ptr_sval, ctxt); + } + + /* Initialize OB_BASE_REGION as a PyObject_HEAD + i.e. set "ob_refcnt" to 1 and "ob_type" to PYTYPE_VAR_DECL_PTR. */ + void + init_PyObject_HEAD (region_model *model, + const region *ob_base_region, + tree pytype_var_decl_ptr, + region_model_context *ctxt) + { + // Initialize ob_refcnt field to 1. + init_ob_refcnt_field (model, ob_base_region, ctxt); + + /* Get pointer svalue for PYTYPE_VAR_DECL_PTR then + assign it to ob_type field. 
*/ + set_ob_type_field (model, ob_base_region, + pytype_var_decl_ptr, ctxt); + } + + // Get subregion for ob_refcnt within a PyObject instance + const region * + get_region_PyObject_ob_refcnt (region_model_manager *mgr, + const region *pyobject_instance_reg) + { + const region *ob_refcnt_region + = mgr->get_field_region (pyobject_instance_reg, + m_field_PyObject_ob_refcnt); + return ob_refcnt_region; + } + + void + do_Py_INCREF (region_model *model, + const region *pyobject_instance_reg, + region_model_context *ctxt) + { + region_model_manager *mgr = model->get_manager (); + const region *ob_refcnt_region + = get_region_PyObject_ob_refcnt (mgr, pyobject_instance_reg); + inc_field_val (model, ob_refcnt_region, size_type_node, ctxt); + } + + // Get subregion for ob_size within a PyVarObject instance + const region * + get_region_PyVarObject_ob_size (region_model_manager *mgr, + const region *pyvarobject_instance_reg) + { + const region *ob_size_region + = mgr->get_field_region (pyvarobject_instance_reg, + m_field_PyVarObject_ob_size); + return ob_size_region; + } + + /* Attempt to get the subregion for the "ob_digit" field within + PYLONG_REGION, a PyLongObject instance. */ + const region * + get_region_PyLongObject_ob_digit (region_model_manager *mgr, + const region *pylong_region) + { + if (tree ob_digit_field + = get_field_by_name (m_type_PyLongObject, "ob_digit", false)) + { + const region *ob_digit_region + = mgr->get_field_region (pylong_region, ob_digit_field); + return ob_digit_region; + } + + /* TODO: https://github.com/python/cpython/pull/101292 + moved "ob_digit" from struct _longobject to a new field long_value + of a new struct _PyLongValue. */ + + return nullptr; + } + + /* Increment the value of FIELD_REGION in the MODEL by 1. Optionally + capture the old and new svalues if OLD_SVAL and NEW_SVAL pointers are + provided. 
*/ + void + inc_field_val (region_model *model, + const region *field_region, const tree type_node, + region_model_context *ctxt, + const svalue **old_sval = nullptr, + const svalue **new_sval = nullptr) + { + region_model_manager *mgr = model->get_manager (); + const svalue *tmp_old_sval = model->get_store_value (field_region, ctxt); + const svalue *one_sval = mgr->get_or_create_int_cst (type_node, 1); + const svalue *tmp_new_sval + = mgr->get_or_create_binop (type_node, PLUS_EXPR, tmp_old_sval, one_sval); + + model->set_value (field_region, tmp_new_sval, ctxt); + + if (old_sval) + *old_sval = tmp_old_sval; + + if (new_sval) + *new_sval = tmp_new_sval; + } + +private: + // PyObject + tree m_type_PyObject; // RECORD_TYPE + tree m_field_PyObject_ob_refcnt; // FIELD_DECL + tree m_field_PyObject_ob_type; // FIELD_DECL + + // POINTER_TYPE for "PyObject *" + tree m_type_PyObject_ptr; + + // POINTER_TYPE for "PyObject **" + tree m_type_PyObject_ptr_ptr; + + // PyVarObject + tree m_type_PyVarObject; // RECORD_TYPE + tree m_field_PyVarObject_ob_size; // FIELD_DECL + + // PyListObject + tree m_type_PyListObject; // RECORD_TYPE + tree m_field_PyListObject_ob_item; // FIELD_DECL + tree m_vardecl_PyList_Type; // VAR_DECL for CPython's global var PyList_Type + + // PyLongObject + tree m_type_PyLongObject; // RECORD_TYPE + tree m_vardecl_PyLong_Type; // VAR_DECL for CPython's global var PyLong_Type + +} api; /* Retrieve the "ob_base" field's region from OBJECT_RECORD within NEW_OBJECT_REGION and set its value in the MODEL to PYOBJ_SVALUE. */ @@ -132,7 +374,7 @@ get_ob_base_region (region_model_manager *mgr, region_model *model, return ob_base_region; } -/* Initialize and retrieve a region within the MODEL for a PyObject +/* Initialize and retrieve a region within the MODEL for a PyObject and set its value to OBJECT_SVALUE. 
*/ static const region * init_pyobject_region (region_model_manager *mgr, region_model *model, @@ -144,30 +386,6 @@ init_pyobject_region (region_model_manager *mgr, region_model *model, return pyobject_region; } -/* Increment the value of FIELD_REGION in the MODEL by 1. Optionally - capture the old and new svalues if OLD_SVAL and NEW_SVAL pointers are - provided. */ -static void -inc_field_val (region_model_manager *mgr, region_model *model, - const region *field_region, const tree type_node, - const call_details &cd, const svalue **old_sval = nullptr, - const svalue **new_sval = nullptr) -{ - const svalue *tmp_old_sval - = model->get_store_value (field_region, cd.get_ctxt ()); - const svalue *one_sval = mgr->get_or_create_int_cst (type_node, 1); - const svalue *tmp_new_sval = mgr->get_or_create_binop ( - type_node, PLUS_EXPR, tmp_old_sval, one_sval); - - model->set_value (field_region, tmp_new_sval, cd.get_ctxt ()); - - if (old_sval) - *old_sval = tmp_old_sval; - - if (new_sval) - *new_sval = tmp_new_sval; -} - class pyobj_init_fail : public failed_call_info { public: @@ -194,11 +412,17 @@ class refcnt_mismatch : public pending_diagnostic_subclass<refcnt_mismatch> { public: refcnt_mismatch (const region *base_region, - const svalue *ob_refcnt, - const svalue *actual_refcnt, - tree reg_tree) - : m_base_region (base_region), m_ob_refcnt (ob_refcnt), - m_actual_refcnt (actual_refcnt), m_reg_tree(reg_tree) + const svalue *ob_refcnt, + const svalue *actual_refcnt, + tree reg_tree, + tree expected_refcnt_tree, + tree actual_refcnt_tree) + : m_base_region (base_region), + m_ob_refcnt (ob_refcnt), + m_actual_refcnt (actual_refcnt), + m_reg_tree (reg_tree), + m_expected_refcnt_tree (expected_refcnt_tree), + m_actual_refcnt_tree (actual_refcnt_tree) { } @@ -225,16 +449,11 @@ public: emit (diagnostic_emission_context &ctxt) final override { bool warned; - // just assuming constants for now - auto actual_refcnt - = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant 
(); - auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant (); warned = ctxt.warn ("expected %qE to have " "reference count: %qE but ob_refcnt field is: %qE", - m_reg_tree, actual_refcnt, ob_refcnt); - - // location_t loc = rich_loc->get_loc (); - // foo (loc); + m_reg_tree, + m_actual_refcnt_tree, + m_expected_refcnt_tree); return warned; } @@ -245,15 +464,12 @@ public: } private: - - void foo(location_t loc) const - { - inform(loc, "something is up right here"); - } const region *m_base_region; const svalue *m_ob_refcnt; const svalue *m_actual_refcnt; tree m_reg_tree; + tree m_expected_refcnt_tree; + tree m_actual_refcnt_tree; }; /* Retrieves the svalue associated with the ob_refcnt field of the base region. @@ -263,7 +479,7 @@ retrieve_ob_refcnt_sval (const region *base_reg, const region_model *model, region_model_context *ctxt) { region_model_manager *mgr = model->get_manager (); - tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt"); + tree ob_refcnt_tree = api.get_field_PyObject_ob_refcnt (); const region *ob_refcnt_region = mgr->get_field_region (base_reg, ob_refcnt_tree); const svalue *ob_refcnt_sval @@ -310,7 +526,7 @@ count_pyobj_references (const region_model *model, seen.add (pyobj_region); - if (pyobj_ptr_sval->get_type () == pyobj_ptr_tree) + if (pyobj_ptr_sval->get_type () == api.get_type_PyObject_ptr ()) increment_region_refcnt (region_to_refcnt, pyobj_region); const auto *curr_store = model->get_store (); @@ -347,18 +563,30 @@ check_refcnt (const region_model *model, const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst ( ob_refcnt_sval->get_type (), actual_refcnt); - if (ob_refcnt_sval != actual_refcnt_sval) + if (ob_refcnt_sval != actual_refcnt_sval + && ob_refcnt_sval->get_kind () != SK_UNKNOWN + && actual_refcnt_sval->get_kind () != SK_UNKNOWN) { const svalue *curr_reg_sval - = mgr->get_ptr_svalue (pyobj_ptr_tree, curr_region); + = mgr->get_ptr_svalue (api.get_type_PyObject_ptr (), curr_region); tree 
reg_tree = old_model->get_representative_tree (curr_reg_sval); if (!reg_tree) return; + tree expected_refcnt_tree + = old_model->get_representative_tree (ob_refcnt_sval); + if (!expected_refcnt_tree) + return; + tree actual_refcnt_tree + = old_model->get_representative_tree (actual_refcnt_sval); + if (!actual_refcnt_tree) + return; const auto &eg = ctxt->get_eg (); auto pd = std::make_unique<refcnt_mismatch> (curr_region, ob_refcnt_sval, actual_refcnt_sval, - reg_tree); + reg_tree, + expected_refcnt_tree, + actual_refcnt_tree); if (pd && eg) ctxt->warn (std::move (pd), make_ploc_fixer_for_epath_for_leak_diagnostic (*eg, @@ -417,7 +645,7 @@ count_all_references (const region_model *model, const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable (); - // if (unwrapped_sval->get_type () != pyobj_ptr_tree) + // if (unwrapped_sval->get_type () != api.m_type_PyObject_ptr) // continue; const region *pointee = unwrapped_sval->maybe_get_region (); @@ -520,15 +748,15 @@ kf_PyList_Append::impl_call_pre (const call_details &cd) const return; // PyList_Check - tree ob_type_field = get_field_by_name (pyobj_record, "ob_type"); + tree ob_type_field = api.get_field_PyObject_ob_type (); const region *ob_type_region = mgr->get_field_region (pylist_reg, ob_type_field); const svalue *stored_sval = model->get_store_value (ob_type_region, cd.get_ctxt ()); const region *pylist_type_region - = mgr->get_region_for_global (pylisttype_vardecl); + = mgr->get_region_for_global (api.get_vardecl_PyList_Type ()); tree pylisttype_vardecl_ptr - = build_pointer_type (TREE_TYPE (pylisttype_vardecl)); + = build_pointer_type (TREE_TYPE (api.get_vardecl_PyList_Type ())); const svalue *pylist_type_ptr = mgr->get_ptr_svalue (pylisttype_vardecl_ptr, pylist_type_region); @@ -578,7 +806,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const pylist_sval, cd.get_arg_tree (0), cd.get_ctxt ()); /* Identify ob_item field and set it to NULL. 
*/ - tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item"); + tree ob_item_field = api.get_field_PyListObject_ob_item (); const region *ob_item_reg = mgr->get_field_region (pylist_reg, ob_item_field); const svalue *old_ptr_sval @@ -592,7 +820,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const model->unset_dynamic_extents (freed_reg); } - const svalue *null_sval = mgr->get_or_create_null_ptr (pyobj_ptr_ptr); + const svalue *null_sval = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ()); model->set_value (ob_item_reg, null_sval, cd.get_ctxt ()); if (cd.get_lhs_type ()) @@ -633,20 +861,19 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const region *newitem_reg = model->deref_rvalue ( newitem_sval, cd.get_arg_tree (1), cd.get_ctxt ()); - tree ob_size_field = get_field_by_name (varobj_record, "ob_size"); const region *ob_size_region - = mgr->get_field_region (pylist_reg, ob_size_field); + = api.get_region_PyVarObject_ob_size (mgr, pylist_reg); const svalue *ob_size_sval = nullptr; const svalue *new_size_sval = nullptr; - inc_field_val (mgr, model, ob_size_region, integer_type_node, cd, - &ob_size_sval, &new_size_sval); + api.inc_field_val (model, ob_size_region, integer_type_node, ctxt, + &ob_size_sval, &new_size_sval); const svalue *sizeof_sval = mgr->get_or_create_cast ( - ob_size_sval->get_type (), get_sizeof_pyobjptr (mgr)); + ob_size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr)); const svalue *num_allocated_bytes = mgr->get_or_create_binop ( size_type_node, MULT_EXPR, sizeof_sval, new_size_sval); - tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item"); + tree ob_item_field = api.get_field_PyListObject_ob_item (); const region *ob_item_region = mgr->get_field_region (pylist_reg, ob_item_field); const svalue *ob_item_ptr_sval @@ -655,7 +882,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const /* We can only grow in place with a non-NULL pointer and no unknown 
*/ { - const svalue *null_ptr = mgr->get_or_create_null_ptr (pyobj_ptr_ptr); + const svalue *null_ptr = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ()); if (!model->add_constraint (ob_item_ptr_sval, NE_EXPR, null_ptr, cd.get_ctxt ())) { @@ -696,14 +923,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const svalue *offset_sval = mgr->get_or_create_binop ( size_type_node, MULT_EXPR, sizeof_sval, ob_size_sval); const region *element_region - = mgr->get_offset_region (curr_reg, pyobj_ptr_ptr, offset_sval); + = mgr->get_offset_region (curr_reg, api.get_type_PyObject_ptr_ptr (), offset_sval); model->set_value (element_region, newitem_sval, cd.get_ctxt ()); // Increment ob_refcnt of appended item. - tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt"); - const region *ob_refcnt_region - = mgr->get_field_region (newitem_reg, ob_refcnt_tree); - inc_field_val (mgr, model, ob_refcnt_region, size_type_node, cd); + api.do_Py_INCREF (model, newitem_reg, ctxt); if (cd.get_lhs_type ()) { @@ -742,20 +966,19 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const region *newitem_reg = model->deref_rvalue ( newitem_sval, cd.get_arg_tree (1), cd.get_ctxt ()); - tree ob_size_field = get_field_by_name (varobj_record, "ob_size"); const region *ob_size_region - = mgr->get_field_region (pylist_reg, ob_size_field); + = api.get_region_PyVarObject_ob_size (mgr, pylist_reg); const svalue *old_ob_size_sval = nullptr; const svalue *new_ob_size_sval = nullptr; - inc_field_val (mgr, model, ob_size_region, integer_type_node, cd, - &old_ob_size_sval, &new_ob_size_sval); + api.inc_field_val (model, ob_size_region, integer_type_node, ctxt, + &old_ob_size_sval, &new_ob_size_sval); const svalue *sizeof_sval = mgr->get_or_create_cast ( - old_ob_size_sval->get_type (), get_sizeof_pyobjptr (mgr)); + old_ob_size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr)); const svalue *new_size_sval = mgr->get_or_create_binop ( 
size_type_node, MULT_EXPR, sizeof_sval, new_ob_size_sval); - tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item"); + tree ob_item_field = api.get_field_PyListObject_ob_item (); const region *ob_item_reg = mgr->get_field_region (pylist_reg, ob_item_field); const svalue *old_ptr_sval @@ -765,7 +988,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const region *new_reg = model->get_or_create_region_for_heap_alloc ( new_size_sval, cd.get_ctxt ()); const svalue *new_ptr_sval - = mgr->get_ptr_svalue (pyobj_ptr_ptr, new_reg); + = mgr->get_ptr_svalue (api.get_type_PyObject_ptr_ptr (), new_reg); if (!model->add_constraint (new_ptr_sval, NE_EXPR, old_ptr_sval, cd.get_ctxt ())) return false; @@ -780,11 +1003,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const svalue *copied_size_sval = get_copied_size (model, old_size_sval, new_size_sval); const region *copied_old_reg = mgr->get_sized_region ( - freed_reg, pyobj_ptr_ptr, copied_size_sval); + freed_reg, api.get_type_PyObject_ptr_ptr (), copied_size_sval); const svalue *buffer_content_sval = model->get_store_value (copied_old_reg, cd.get_ctxt ()); const region *copied_new_reg = mgr->get_sized_region ( - new_reg, pyobj_ptr_ptr, copied_size_sval); + new_reg, api.get_type_PyObject_ptr_ptr (), copied_size_sval); model->set_value (copied_new_reg, buffer_content_sval, cd.get_ctxt ()); } @@ -797,7 +1020,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const model->unset_dynamic_extents (freed_reg); } - const svalue *null_ptr = mgr->get_or_create_null_ptr (pyobj_ptr_ptr); + const svalue *null_ptr = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ()); if (!model->add_constraint (new_ptr_sval, NE_EXPR, null_ptr, cd.get_ctxt ())) return false; @@ -808,14 +1031,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const const svalue *offset_sval = mgr->get_or_create_binop ( size_type_node, MULT_EXPR, sizeof_sval, old_ob_size_sval); const region 
*element_region
-      = mgr->get_offset_region (new_reg, pyobj_ptr_ptr, offset_sval);
+      = mgr->get_offset_region (new_reg, api.get_type_PyObject_ptr_ptr (), offset_sval);
   model->set_value (element_region, newitem_sval, cd.get_ctxt ());

   // Increment ob_refcnt of appended item.
-  tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
-  const region *ob_refcnt_region
-      = mgr->get_field_region (newitem_reg, ob_refcnt_tree);
-  inc_field_val (mgr, model, ob_refcnt_region, size_type_node, cd);
+  api.do_Py_INCREF (model, newitem_reg, ctxt);

   if (cd.get_lhs_type ())
     {
@@ -885,11 +1105,11 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
   region_model_manager *mgr = cd.get_manager ();

   const svalue *pyobj_svalue
-      = mgr->get_or_create_unknown_svalue (pyobj_record);
+      = mgr->get_or_create_unknown_svalue (api.get_type_PyObject ());
   const svalue *varobj_svalue
-      = mgr->get_or_create_unknown_svalue (varobj_record);
+      = mgr->get_or_create_unknown_svalue (api.get_type_PyVarObject ());
   const svalue *pylist_svalue
-      = mgr->get_or_create_unknown_svalue (pylistobj_record);
+      = mgr->get_or_create_unknown_svalue (api.get_type_PyListObject ());

   const svalue *size_sval = cd.get_arg_svalue (0);

@@ -904,12 +1124,13 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
     Py_ssize_t allocated;
   } PyListObject;
   */
-  tree varobj_field = get_field_by_name (pylistobj_record, "ob_base");
+  tree varobj_field = get_field_by_name (api.get_type_PyListObject (),
+                                         "ob_base");
   const region *varobj_region
       = mgr->get_field_region (pylist_region, varobj_field);
   model->set_value (varobj_region, varobj_svalue, cd.get_ctxt ());

-  tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item");
+  tree ob_item_field = api.get_field_PyListObject_ob_item ();
   const region *ob_item_region
       = mgr->get_field_region (pylist_region, ob_item_field);

@@ -925,13 +1146,13 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
                           integer_one_node))
     {
       const svalue *null_sval
-          = mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
+          = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ());
       model->set_value (ob_item_region, null_sval, cd.get_ctxt ());
     }
   else // calloc
     {
       const svalue *sizeof_sval = mgr->get_or_create_cast (
-          size_sval->get_type (), get_sizeof_pyobjptr (mgr));
+          size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr));
       const svalue *prod_sval = mgr->get_or_create_binop (
           size_type_node, MULT_EXPR, sizeof_sval, size_sval);
       const region *ob_item_sized_region
@@ -939,7 +1160,8 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
                                                          cd.get_ctxt ());
       model->zero_fill_region (ob_item_sized_region, cd.get_ctxt ());
       const svalue *ob_item_ptr_sval
-          = mgr->get_ptr_svalue (pyobj_ptr_ptr, ob_item_sized_region);
+          = mgr->get_ptr_svalue (api.get_type_PyObject_ptr_ptr (),
+                                 ob_item_sized_region);
       const svalue *ob_item_unmergeable
           = mgr->get_or_create_unmergeable (ob_item_ptr_sval);
       model->set_value (ob_item_region, ob_item_unmergeable,
@@ -952,27 +1174,17 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
     Py_ssize_t ob_size; // Number of items in variable part
   } PyVarObject;
   */
-  const region *ob_base_region = get_ob_base_region (
-      mgr, model, varobj_region, varobj_record, pyobj_svalue, cd);
+  const region *ob_base_region
+    = get_ob_base_region (mgr, model, varobj_region,
+                          api.get_type_PyVarObject (),
+                          pyobj_svalue, cd);

-  tree ob_size_tree = get_field_by_name (varobj_record, "ob_size");
   const region *ob_size_region
-      = mgr->get_field_region (varobj_region, ob_size_tree);
+    = api.get_region_PyVarObject_ob_size (mgr, varobj_region);
   model->set_value (ob_size_region, size_sval, cd.get_ctxt ());

-  /*
-  typedef struct _object {
-    _PyObject_HEAD_EXTRA
-    Py_ssize_t ob_refcnt;
-    PyTypeObject *ob_type;
-  } PyObject;
-  */
-
-  // Initialize ob_refcnt field to 1.
-  init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record, cd);
-
-  // Get pointer svalue for PyList_Type then assign it to ob_type field.
-  set_ob_type_field(mgr, model, ob_base_region, pyobj_record, pylisttype_vardecl, cd);
+  api.init_PyObject_HEAD (model, ob_base_region,
+                          api.get_vardecl_PyList_Type (), ctxt);

   if (cd.get_lhs_type ())
     {
@@ -1019,29 +1231,27 @@ kf_PyLong_FromLong::impl_call_post (const call_details &cd) const
   region_model_manager *mgr = cd.get_manager ();

   const svalue *pyobj_svalue
-      = mgr->get_or_create_unknown_svalue (pyobj_record);
+      = mgr->get_or_create_unknown_svalue (api.get_type_PyObject ());
   const svalue *pylongobj_sval
-      = mgr->get_or_create_unknown_svalue (pylongobj_record);
+      = mgr->get_or_create_unknown_svalue (api.get_type_PyLongObject ());

   const region *pylong_region
       = init_pyobject_region (mgr, model, pylongobj_sval, cd);

   // Create a region for the base PyObject within the PyLongObject.
   const region *ob_base_region = get_ob_base_region (
-      mgr, model, pylong_region, pylongobj_record, pyobj_svalue, cd);
-
-  // Initialize ob_refcnt field to 1.
-  init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record, cd);
+      mgr, model, pylong_region, api.get_type_PyLongObject (), pyobj_svalue, cd);

-  // Get pointer svalue for PyLong_Type then assign it to ob_type field.
-  set_ob_type_field(mgr, model, ob_base_region, pyobj_record, pylongtype_vardecl, cd);
+  api.init_PyObject_HEAD (model, ob_base_region,
+                          api.get_vardecl_PyLong_Type (), ctxt);

   // Set the PyLongObject value.
-  tree ob_digit_field = get_field_by_name (pylongobj_record, "ob_digit");
-  const region *ob_digit_region
-      = mgr->get_field_region (pylong_region, ob_digit_field);
-  const svalue *ob_digit_sval = cd.get_arg_svalue (0);
-  model->set_value (ob_digit_region, ob_digit_sval, cd.get_ctxt ());
+  if (const region *ob_digit_region
+      = api.get_region_PyLongObject_ob_digit (mgr, pylong_region))
+    {
+      const svalue *ob_digit_sval = cd.get_arg_svalue (0);
+      model->set_value (ob_digit_region, ob_digit_sval, cd.get_ctxt ());
+    }

   if (cd.get_lhs_type ())
     {
@@ -1138,6 +1348,7 @@ get_stashed_type_by_name (const char *name)
       gcc_assert (TREE_CODE (*slot) == RECORD_TYPE);
       return *slot;
     }
+  inform (UNKNOWN_LOCATION, "could not find CPython type %qs", name);
   return NULL_TREE;
 }

@@ -1152,24 +1363,72 @@ get_stashed_global_var_by_name (const char *name)
       gcc_assert (TREE_CODE (*slot) == VAR_DECL);
       return *slot;
     }
+  inform (UNKNOWN_LOCATION, "could not find CPython global %qs", name);
   return NULL_TREE;
 }

-static void
-init_py_structs ()
+/* Attempt to find the various types for the CPython API, which
+   we hope were stashed when the frontend ran.
+
+   Return true if we found enough to run the plugin,
+   false if it doesn't make sense to run it.  */
+
+bool
+api::init_from_stashed_types ()
 {
-  pyobj_record = get_stashed_type_by_name ("PyObject");
-  varobj_record = get_stashed_type_by_name ("PyVarObject");
-  pylistobj_record = get_stashed_type_by_name ("PyListObject");
-  pylongobj_record = get_stashed_type_by_name ("PyLongObject");
-  pylongtype_vardecl = get_stashed_global_var_by_name ("PyLong_Type");
-  pylisttype_vardecl = get_stashed_global_var_by_name ("PyList_Type");
-
-  if (pyobj_record)
-    {
-      pyobj_ptr_tree = build_pointer_type (pyobj_record);
-      pyobj_ptr_ptr = build_pointer_type (pyobj_ptr_tree);
-    }
+  memset (this, 0, sizeof (*this));
+
+  m_type_PyObject = get_stashed_type_by_name ("PyObject");
+  if (!m_type_PyObject)
+    return false;
+  gcc_assert (TREE_CODE (m_type_PyObject) == RECORD_TYPE);
+
+  m_field_PyObject_ob_refcnt
+    = get_field_by_name (m_type_PyObject, "ob_refcnt");
+  if (!m_field_PyObject_ob_refcnt)
+    return false;
+
+  m_field_PyObject_ob_type
+    = get_field_by_name (m_type_PyObject, "ob_type");
+  if (!m_field_PyObject_ob_type)
+    return false;
+
+  /* PyVarObject.  */
+  m_type_PyVarObject = get_stashed_type_by_name ("PyVarObject");
+  if (!m_type_PyVarObject)
+    return false;
+  gcc_assert (TREE_CODE (m_type_PyVarObject) == RECORD_TYPE);
+  m_field_PyVarObject_ob_size
+    = get_field_by_name (m_type_PyVarObject, "ob_size");
+  if (!m_field_PyVarObject_ob_size)
+    return false;
+
+  /* PyListObject.  */
+  m_type_PyListObject = get_stashed_type_by_name ("PyListObject");
+  if (!m_type_PyListObject)
+    return false;
+  m_field_PyListObject_ob_item
+    = get_field_by_name (m_type_PyListObject, "ob_item");
+  if (!m_field_PyListObject_ob_item)
+    return false;
+
+  m_vardecl_PyList_Type = get_stashed_global_var_by_name ("PyList_Type");
+  if (!m_vardecl_PyList_Type)
+    return false;
+
+  /* PyLongObject.  */
+  m_type_PyLongObject = get_stashed_type_by_name ("PyLongObject");
+  if (!m_type_PyLongObject)
+    return false;
+
+  m_vardecl_PyLong_Type = get_stashed_global_var_by_name ("PyLong_Type");
+  if (!m_vardecl_PyLong_Type)
+    return false;
+
+  m_type_PyObject_ptr = build_pointer_type (m_type_PyObject);
+  m_type_PyObject_ptr_ptr = build_pointer_type (m_type_PyObject_ptr);
+
+  return true;
 }

 void
@@ -1182,9 +1441,14 @@ sorry_no_cpython_plugin ()

 namespace analyzer_events = ::gcc::topics::analyzer_events;

-class cpython_analyzer_events_subscriber : public analyzer_events::subscriber
+class analyzer_events_subscriber : public analyzer_events::subscriber
 {
 public:
+  analyzer_events_subscriber ()
+  : m_init_failed (false)
+  {
+  }
+
   void
   on_message (const analyzer_events::on_tu_finished &msg) final override
   {
@@ -1198,13 +1462,13 @@ public:
   {
     LOG_SCOPE (m.get_logger ());

-    init_py_structs ();
-
-    if (pyobj_record == NULL_TREE)
+    if (!api.init_from_stashed_types ())
       {
         sorry_no_cpython_plugin ();
+        m_init_failed = true;
         return;
       }
+    gcc_assert (api.get_type_PyObject ());

     m.register_known_function ("PyList_Append",
                                std::make_unique<kf_PyList_Append> ());
@@ -1220,13 +1484,19 @@ public:
   void
   on_message (const analyzer_events::on_frame_popped &msg) final override
   {
+    if (m_init_failed)
+      return;
     pyobj_refcnt_checker (msg.m_new_model,
                           msg.m_old_model,
                           msg.m_retval,
                           msg.m_ctxt);
   }

-} cpython_sub;
+private:
+  bool m_init_failed;
+} event_sub;
+
+} // namespace ana::cpython_plugin
 } // namespace ana

 #endif /* #if ENABLE_ANALYZER */
@@ -1239,7 +1509,8 @@ plugin_init (struct plugin_name_args *plugin_info,
   const char *plugin_name = plugin_info->base_name;
   if (0)
     inform (input_location, "got here; %qs", plugin_name);
-  g->get_channels ().analyzer_events_channel.add_subscriber (ana::cpython_sub);
+  g->get_channels ().analyzer_events_channel.add_subscriber
+    (ana::cpython_plugin::event_sub);
 #else
   sorry_no_analyzer ();
 #endif
