[ https://issues.apache.org/jira/browse/DISPATCH-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiri Daněk updated DISPATCH-1962: --------------------------------- Summary: Make LSan suppressions more targeted and specific (within reason) (was: Make LSan suppressions more targetted and specific (within reason)) > Make LSan suppressions more targeted and specific (within reason) > ----------------------------------------------------------------- > > Key: DISPATCH-1962 > URL: https://issues.apache.org/jira/browse/DISPATCH-1962 > Project: Qpid Dispatch > Issue Type: Test > Affects Versions: 1.15.0 > Reporter: Jiri Daněk > Priority: Major > Attachments: dispatch_leaking_agent.png > > > There are some unfortunate suppressions in {{tests/lsan.supp}} at this moment: > {code} > leak:*libwebsockets* > {code} > This is way too broad. It suppresses leaks in Dispatch files, since it > matches e.g. {{src/http-libwebsockets.c}}. > The stars at the beginning and end are actually assumed implicitly. If you do > not want substring match, you have to do ^foo$. LSan suppression format is > simplistic, very unlike Valgrind's. > {code} > leak:run_unit_tests.c > {code} > Same thing, any leaks revealed by running unit_tests get suppressed. This > suppression suppresses all leak traces that include run_unit_tests.c anywhere > in the stack. What't the point of running such tests under leak detector, > then? > {code} > leak:run_unit_tests.c > leak:^libqpid-proton.so$ > {code} > Same thing. The patterns suppress all leaks that include Python (or Proton) > anywhere in the stacktrace. That means there are huge blind spots where > dispatch leaks can hide. This is a weakness of the lsan.supp syntax (Valgrind > suppressions can be much more targeted and discerning). > h3. Python leaks > Leaks are known and there is ongoing effort to fight them: > https://bugs.python.org/issue1635741 (https://bugs.python.org/issue25302) and > https://www.python.org/dev/peps/pep-3121 > Here's valgrind suppression file from somebody who actually investigated the > Python leaks and identified the harmless ones: > https://github.com/libgit2/pygit2/blob/master/misc/valgrind-python.supp > One example of a hidden leak in dispatch, which is revealed by making Python > suppressions more targetted: > {code} > 9: Direct leak of 56 byte(s) in 1 object(s) allocated from: > 9: #0 0x7f78a3606e8f in __interceptor_malloc > (/nix/store/g40sl3zh3nv52vj0mrl4iki5iphh5ika-gcc-10.2.0-lib/lib/libasan.so.6+0xace8f) > 9: #1 0x7f78a2d64afb in qd_malloc ../include/qpid/dispatch/ctools.h:229 > 9: #2 0x7f78a2d657da in qdr_core_subscribe > ../src/router_core/route_tables.c:149 > 9: #3 0x7f78a2c83072 in IoAdapter_init ../src/python_embedded.c:711 > 9: #4 0x7f78a2353a6c in type_call > (/nix/store/r85nxfnwiv45nbmf5yb60jj8ajim4m7w-python3-3.8.5/lib/libpython3.8.so.1.0+0x165a6c) > {code} > The problem is in > {code} > class Agent: > ... > def activate(self, address): > ... > self.io = IoAdapter(self.receive, address, 'L', '0', > TREATMENT_ANYCAST_CLOSEST) > {code} > IoAdapter refers to Agent (through the bound method reference self.receive) > and Agent refers to IoAdapter (through property self.io). Since IoAdapter is > implemented in C and does not implement support for Python's cyclic GC, there > is no way to break the cycle. > Heap dump in attachment. The bound method is at the top of the picture. > (Ignore the Mock objects, I was trying to simplify the picture while not > getting crashes due to too much meddling). > h3. Random observations > It is possible to build special Debug build of Python, which has tools to > detect leaks, asserts to prevent negative refcounts, etc. > https://pythonextensionpatterns.readthedocs.io/en/latest/debugging/debug_python.html#debug-version-of-python-memory-alloc-label > Use the following to detect python leaks (instead of valgrind) > https://docs.python.org/3/library/tracemalloc.html > Use https://pypi.org/project/objgraph (with graphviz) to view heap object > trees. The following renders the picture as a png under /tmp and prints the > path to stdout. > {code} > int ret = PyRun_SimpleString("import objgraph; > objgraph.show_backrefs(config.global_agent, max_depth=10)\n\n"); > PyErr_PrintEx(0); > assert(ret == 0); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org For additional commands, e-mail: dev-h...@qpid.apache.org