STINNER Victor <vstin...@python.org> added the comment:

Extract of Brandt's PR:

// The GC may have untracked this result tuple if its elements were all
// untracked. Since we're recycling it, make sure it's tracked again:
if (!_PyObject_GC_IS_TRACKED(result)) {
    _PyObject_GC_TRACK(result);
}

I would like to understand why the tuple is no longer tracked, whereas 
a tuple newly created by PyTuple_New() is tracked.

Using gdb, I found that gc_collect_main() calls untrack_tuples(young), which 
tries to untrack every tuple of the young generation: a tuple is untracked 
when none of its items is tracked.
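
The untracking can also be observed from pure Python. The sketch below is 
only an illustration and is not taken from the PR: the helper name 
tracking_after_collect() is made up, and it assumes a CPython build where 
tuples are untracked by the collector as described above.

```python
import gc


def tracking_after_collect(items):
    """Build a fresh tuple, run a full collection, report gc.is_tracked()."""
    # tuple() on a list builds the tuple at run time, so it is not a
    # constant folded into the code object by the compiler.
    t = tuple(items)
    gc.collect()
    return gc.is_tracked(t)


if __name__ == "__main__":
    # Only untracked items (None): the tuple is expected to be untracked.
    print(tracking_after_collect([None, None]))
    # One item is a list, which is GC-tracked: the tuple must stay tracked.
    print(tracking_after_collect([[], None]))
```

Under that assumption, the first call should report that the tuple is no 
longer tracked, while the second tuple stays tracked because it holds a list.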

I understand that (when the issue happens):

* A zip() object is created with lz->result = (None, None)
* A GC collection happens
* The GC untracks the (None, None) tuple
* next(zip) is called: lz->result has a reference count of 1, so zip_next() 
reuses it instead of allocating a new tuple.

Problem: the tuple is no longer tracked, whereas its content changed and so the 
newly filled tuple might be part of a reference cycle. Since the tuple is not 
tracked, the GC can no longer break the reference cycle involving the zip 
object internal tuple.

Example of code where the zip tuple is untracked before zip_next() is called on 
the zip object:

    def test_product(self):
        # Low threshold so that collections run very frequently
        gc.set_threshold(5)
        pools = [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
                 ('a', 'b', 'c'),
                 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
                 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)]
        indices = [0, 2, 10, 11]
        print(indices)
        print(pools)
        list(None for a, b in zip(pools, indices))

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42536>
_______________________________________