[Thomas Wouters] > ... > The cycle this nested generator creates, which is also involved in the > test_tee > leak, is not cleanable by the cycle-gc, and it looks like it hasn't been > since the yield-expr/coroutine patch was included in the trunk.
That could very well be. Adding finalizers to generators could legitimately make some cycles "suddenly" uncollectable. There was a burst of list traffic about this on the 18th and 19th of June, 2005, under subject: gcmodule issue w/adding __del__ to generator objects I starred it in gmail at the time (which is why I was able to find it again ;-)), but never had time to pay attention. > I believe the culprit to be this part of that patch: > > Index: Modules/gcmodule.c >=================================================================== > --- Modules/gcmodule.c (revision 39238) > +++ Modules/gcmodule.c (revision 39239) > @@ -413,10 +413,8 @@ > assert(delstr != NULL); > return _PyInstance_Lookup(op, delstr) != NULL; > } > - else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE)) > + else > return op->ob_type->tp_del != NULL; > - else > - return 0; > } Right, that's the patch that taught gc that generators now have finalizers. The original code can be read in two ways: 1. As a whole, it was an implementation of: Only instances of old- or new-style classes can have finalizers. An instance of an old-style class has a finalizer iff it has a method named "__del__". An instance of a new-style class (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE) has a finalizer iff its tp_del is non-NULL. 2. Because of the obscure gimmicks that try to cater to using old binary extension modules across new Python releases without recompiling, there's no guarantee that the tp_del slot even exists, and therefore we don't try to access tp_del at all unless PyType_HasFeature says the type object was compiled recently enough so that we know tp_del does exist. When generators grew finalizers, the "Only instances of ... classes can have finalizers" part of #1 became wrong, and I expect that's why Phillip removed the conditional. It was the right thing to do from that point of view. I don't understand the #2 gimmicks well enough to guess whether it was actually safe to remove all guards before trying to access tp_del (I run on Windows, and compiled extensions can never be reused across Python minor releases on Windows -- if a problem was introduced here, I'll never see it). > Now, reverting that part of the patch in revision 39239 There may be a reason to change that patch (due to #2 above), but generators do have finalizers now and the #1 part must not be reverted (although it may be fruitful to complicate it ;-)). > triggers an assertion failure, but removing the assertion makes it work right; No, it's not right if has_finalizer(g) returns 0 for all generators g. > the above nested-generator cycle gets cleaned up again. Later in the trunk, > the > assertion was changed anyway, and it no longer fails if I just revert the > gcmodule change. Of course, the reason the cycle is uncleanable is because > generators grew a tp_del method, and that throws cycle-gc in a hissy fit; If by "hissy fit" you mean "gcmodule properly moves generators to the list of objects with finalizers, thereby preventing catastrophes", right, that's an intended hissy fit ;-) > reverting the above patch just makes cycle-gc ignore the tp_del of > heaptypes. I don't know enough about the cycle-gc to say whether that's good > or bad, Ignoring all generators' tp_del would be disastrous (opens pure-Python code to segfaults, etc). > but not having generators be cleanable is not very good. Not disastrous, though. > For what it's worth, reverting the gcmodule patch doesn't seem to break any > tests. I believe that. gc disasters are typically very difficult to provoke --, the first time :-) > However, fixing _both_ those things (itertools.tee lack of GC, and GC's > inability to clean generators) *still does not fix the 254 refleaks in > test_generators*. When I backport the itertools.tee-GC changes to r39238 > (one checkin before coroutines), test_generators doesn't leak, neither the > r39238 version of test_generators, nor the trunk version. One revision > later, r39239, either test_generators leaks 15 references. 'Fixing' > gcmodule, which fixes the nested-generator leak, does not change the number > of leaks. And, as you all may be aware of, in the trunk, test_generators > leaks 254 references; this is also the case with the gcmodule fix. So > there's more huntin' afoot. > > In the mean time, if people knowledgeable about the cycle-GC care to comment > about the gcmodule change wrt. heaptypes, please do. Did the best I could above, short of explaining exactly how failing to identify an object with a finalizer can lead to disaster. Short course: gc obviously needs to know which objects are and are not trash. If a finalizer is associated with an object in cyclic trash, then trash objects are potentially visible _from_ the finalizer, and therefore can be resurrected by running the finalizer. gc must therefore identify all trash objects reachable from trash objects with finalizers, because running finalizers _may_ make all such objects "not trash" again. Getting any part of that wrong can lead to calling tp_clear on an object that a finalizer actually resurrects, leading to symptoms from "hey, all the attributes on my object suddenly vanished by magic, for no reason at all" to segfaults. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com