[...] > As noted in our chat earlier, I don't think we can easily make these > work. Looking at CPython's implementation: PyList_Type's initializer > here: > https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101 > initializes tp_flags with the flags, but: > (a) we don't see that code when compiling a user's extension module > (b) even if we did, PyList_Type is non-const, so the analyzer has to > assume that tp_flags could have been written to since it was > initialized > > In theory we could specialcase such lookups, so that, say, a plugin > could register assumptions into the analyzer about the value of bits > within (PyList_Type.tp_flags). > > However, this seems like a future feature.
I agree that it is more appropriate as a future feature. Recently, in preparation for a patch, I have been focusing on migrating as much of our plugin-specific functionality as possible, which is currently scattered across core analyzer files for convenience, into the plugin itself. Specifically, I am currently trying to transfer the code related to stashing Python-specific types and global variables into analyzer_cpython_plugin.c. This approach has three main benefits, among which some I believe we have previously discussed: 1) We only need to search for these values when initializing our plugin, instead of every time the analyzer is enabled. 2) We can extend the values that we stash by modifying only our plugin, avoiding changes to core analyzer files such as analyzer-language.cc, which seems a safer and more resilient approach. 3) Future analyzer plugins will have an easier time stashing values relevant to their respective projects. Let me know if my concerns or reasons appear unfounded. My initial approach involved adding a hook to the end of ana::on_finish_translation_unit which calls the relevant stashing-related callbacks registered during plugin initialization. Here's a rough sketch: void on_finish_translation_unit (const translation_unit &tu) { // ... existing code stash_named_constants (the_logger.get_logger (), tu); do_finish_translation_unit_callbacks(the_logger.get_logger (), tu); } Inside do_finish_translation_unit_callbacks we have a loop like so: for (auto& callback : finish_translation_unit_callbacks) { callback(logger, tu); } Where finish_translation_unit_callbacks is a vector defined as follows: typedef void (*finish_translation_unit_callback) (logger *, const translation_unit &); vec<finish_translation_unit_callback> *finish_translation_unit_callbacks; To register a callback, we use: void register_finish_translation_unit_callback ( finish_translation_unit_callback callback) { if (!finish_translation_unit_callbacks) vec_alloc (finish_translation_unit_callbacks, 1); finish_translation_unit_callbacks->safe_push (callback); } And finally, from our plugin (or any other plugin), we can register callbacks like so: ana::register_finish_translation_unit_callback (&stash_named_types); ana::register_finish_translation_unit_callback (&stash_global_vars); However, on_finish_translation_unit runs before plugin initialization occurs, so, unfortunately, we would be registering our callbacks after on_finish_translation_unit with this method. As a workaround, I tried saving the translation unit like this: void on_finish_translation_unit (const translation_unit &tu) { // ... existing code stash_named_constants (the_logger.get_logger (), tu); saved_tu = &tu; } Then in our plugin: ana::register_finish_translation_unit_callback (&stash_named_types); ana::register_finish_translation_unit_callback (&stash_global_vars); ana:: do_finish_translation_unit_callbacks(); With do_finish_translation_units passing the stored_tu to the callbacks. Unfortunately, with this method, it seems like we encounter a segmentation fault when trying to call the lookup functions within translation_unit at the time of plugin initialization, even though the translation unit is stored correctly. So it seems like the solution may not be quite so simple. I'm currently investigating this issue, but if there's an obvious solution that I might be missing or any general suggestions, please let me know! Thanks as always, Eric