Neil Schemenauer <nas-pyt...@arctrix.com> added the comment:

A few comments from the "peanut gallery".  Thanks to Victor and Pablo for doing 
the hard work of investigating this bug.

First, it has been a long time since Tim and I first developed gcmodule.c.  So, 
some of my understanding may be inaccurate due to code changes or old age 
memory loss. ;-P

If I understand Victor's test case correctly, the problem is caused if you have 
an extension type that implements tp_traverse but not tp_clear and that there 
is also a weakref involved in the trash cycle.  In my original design of the 
GC, not implementing tp_clear should have been okay.  It would mean that the GC 
might find garbage reference cycles that it couldn't cleanup.  Those would leak 
memory but would not crash Python.

I'm not sure what has changed since to require that tp_clear actually 
successfully clears the cycle.  Tim was the origin of the code that handles 
weakrefs. The GC is (relatively) simple if not for handling weakrefs and 
finalizers.  Tim did almost all of the difficult and brain exploding work on 
that stuff.  There was a big re-org some years ago to handle "legacy 
finalizers" and PEP 442 finalizers.  That made things more complicated yet.  
Maybe the requirement for a working tp_clear came into existence then?  I added 
Tim to the nosy list since he might have insight.

To me, it seems problematic that we would require a type to have a tp_clear 
method that actually breaks cycles.  For mutable objects like dicts and lists, 
we can have tp_clear do its thing while leaving the object in a valid state.  
The tp_clear added for functions was not like that because other code expected 
structure fields to be non-NULL.  At least that's my understanding.

Is the behavior of tp_clear the key to this bug?  In the original design GC, 
calling tp_clear was not guaranteed to cause objects in the cylce to be freed.  
So, after tp_clear, the object still needs to be in some kind of valid state.  
That is one of the reasons why the tuple type doesn't have a tp_clear (also 
that they can't normally be directly be used to create cycles).

One last observation: the reference cycle GC is now being relied upon in ways 
that were never originally envisioned.  Originally the cyclic GC was an 
optional feature and you could build Python without it.  Reference cycles were 
supposed to exist in rare cases and only be created by user code.  In that 
world, having tp_clear on the function type is be unneeded.  Now, CPython is 
creating reference cycles itself and the GC is being relied on much more to 
free memory.  I'm afraid we are evolving into a complicated and low performance 
GC (refcount + cycle collection).  I don't have any good suggestions on a 
better approach.  Maintaining the C-API ties us to reference counting.  I 
mention it because it might be helpful to have a high-level view of the 
evolution of this part of CPython.

----------
nosy: +nascheme, tim.peters

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38006>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to