Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-28 Thread Thomas Wouters
On 3/27/06, Neal Norwitz [EMAIL PROTECTED] wrote:
On 3/27/06, Thomas Wouters [EMAIL PROTECTED] wrote: The teeobject has GC (hence the word 'and' in 'itertools.tee and its internal teedataobject' ;-) The problem with test_generators is that this
 also leaks: def leak(): def gen(): while True: yield g g = gen() leak()Please add a test for this to the leakers and remove the tee test in
leakers if that no longer leaks.I added this test, but I didn't remove test_tee, because it still leaks. I also checked in a test of code that used to leak, but doesn't anymore. The cycle this nested generator creates, which is also involved in the test_tee leak, is not cleanable by the cycle-gc, and it looks like it hasn't been since the yield-expr/coroutine patch was included in the trunk. I believe the culprit to be this part of that patch:
Index: Modules/gcmodule.c===--- Modules/gcmodule.c (revision 39238)+++ Modules/gcmodule.c (revision 39239)@@ -413,10 +413,8 @@
 assert(delstr != NULL); return _PyInstance_Lookup(op, delstr) != NULL; }- else if (PyType_HasFeature(op-ob_type, Py_TPFLAGS_HEAPTYPE))+ else return op-ob_type-tp_del != NULL;
- else- return 0;}/* Move the objects in unreachable with __del__ methods into `finalizers`.Now, reverting that part of the patch in revision 39239 triggers an assertion failure, but removing the assertion makes it work right; the above nested-generator cycle gets cleaned up again. Later in the trunk, the assertion was changed anyway, and it no longer fails if I just revert the gcmodule change. Of course, the reason the cycle is uncleanable is because generators grew a tp_del method, and that throws cycle-gc in a hissy fit; reverting the above patch just makes cycle-gc ignore the tp_del of heaptypes. I don't know enough about the cycle-gc to say whether that's good or bad, but not having generators be cleanable is not very good. For what it's worth, reverting the gcmodule patch doesn't seem to break any tests.
However, fixing _both_ those things (itertools.tee lack of GC, and GC's inability to clean generators) *still does not fix the 254 refleaks in test_generators*. When I backport the itertools.tee-GC changes to r39238 (one checkin before coroutines), test_generators doesn't leak, neither the r39238 version of test_generators, nor the trunk version. One revision later, r39239, either test_generators leaks 15 references. 'Fixing' gcmodule, which fixes the nested-generator leak, does not change the number of leaks. And, as you all may be aware of, in the trunk, test_generators leaks 254 references; this is also the case with the gcmodule fix. So there's more huntin' afoot.
In the mean time, if people knowledgeable about the cycle-GC care to comment about the gcmodule change wrt. heaptypes, please do.-- Thomas Wouters [EMAIL PROTECTED]
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-28 Thread Tim Peters
[Thomas Wouters]
 ...
 The cycle this nested generator creates, which is also involved in the 
 test_tee
 leak, is not cleanable by the cycle-gc, and it looks like it hasn't been
 since the yield-expr/coroutine patch was included in the trunk.

That could very well be.  Adding finalizers to generators could
legitimately make some cycles suddenly uncollectable.  There was a
burst of list traffic about this on the 18th and 19th of June, 2005,
under subject:

gcmodule issue w/adding __del__ to generator objects

I starred it in gmail at the time (which is why I was able to find it
again ;-)), but never had time to pay attention.

 I believe the culprit to be this part of that patch:

 Index: Modules/gcmodule.c
===
 --- Modules/gcmodule.c  (revision 39238)
 +++ Modules/gcmodule.c  (revision 39239)
 @@ -413,10 +413,8 @@
 assert(delstr != NULL);
 return _PyInstance_Lookup(op, delstr) != NULL;
 }
 -   else if (PyType_HasFeature(op-ob_type, Py_TPFLAGS_HEAPTYPE))
 +   else
 return op-ob_type-tp_del != NULL;
 -   else
 -   return 0;
  }

Right, that's the patch that taught gc that generators now have finalizers.

The original code can be read in two ways:

1. As a whole, it was an implementation of:

   Only instances of old- or new-style classes can have finalizers.
   An instance of an old-style class has a finalizer iff
   it has a method named __del__.
   An instance of a new-style class (PyType_HasFeature(op-ob_type,
   Py_TPFLAGS_HEAPTYPE) has a finalizer iff its tp_del is non-NULL.

2. Because of the obscure gimmicks that try to cater to using old
binary extension modules across new Python releases without
recompiling, there's no guarantee that the tp_del slot even exists,
and therefore we don't try to access tp_del at all unless
PyType_HasFeature says the type object was compiled recently
enough so that we know tp_del does exist.

When generators grew finalizers, the Only instances of ... classes
can have finalizers part of #1 became wrong, and I expect that's why
Phillip removed the conditional.  It was the right thing to do from
that point of view.

I don't understand the #2 gimmicks well enough to guess whether it was
actually safe to remove all guards before trying to access tp_del (I
run on Windows, and compiled extensions can never be reused across
Python minor releases on Windows -- if a problem was introduced here,
I'll never see it).

 Now, reverting that part of the patch in revision 39239

There may be a reason to change that patch (due to #2 above), but
generators do have finalizers now and the #1 part must not be reverted
(although it may be fruitful to complicate it ;-)).

 triggers an assertion failure, but removing the assertion makes it work right;

No, it's not right if has_finalizer(g) returns 0 for all generators g.

 the above nested-generator cycle gets cleaned up again. Later in the trunk, 
 the
 assertion was changed anyway, and it no longer fails if I just revert the
 gcmodule change. Of course, the reason the cycle is uncleanable is because
 generators grew a tp_del method, and that throws cycle-gc in a hissy fit;

If by hissy fit you mean gcmodule properly moves generators to the
list of objects with finalizers, thereby preventing catastrophes,
right, that's an intended hissy fit ;-)

 reverting the above patch just makes cycle-gc ignore the tp_del of
 heaptypes. I don't know enough about the cycle-gc to say whether that's good
 or bad,

Ignoring all generators' tp_del would be disastrous (opens pure-Python
code to segfaults, etc).

 but not having generators be cleanable is not very good.

Not disastrous, though.

 For what it's worth, reverting the gcmodule patch doesn't seem to break any
 tests.

I believe that.  gc disasters are typically very difficult to provoke
--, the first time :-)

 However, fixing _both_ those things (itertools.tee lack of GC, and GC's
 inability to clean generators) *still does not fix the 254 refleaks in
 test_generators*. When I backport the itertools.tee-GC changes to r39238
 (one checkin before coroutines), test_generators doesn't leak, neither the
 r39238 version of test_generators, nor the trunk version. One revision
 later, r39239, either test_generators leaks 15 references. 'Fixing'
 gcmodule, which fixes the nested-generator leak, does not change the number
 of leaks. And, as you all may be aware of, in the trunk, test_generators
 leaks 254 references; this is also the case with the gcmodule fix. So
 there's more huntin' afoot.

 In the mean time, if people knowledgeable about the cycle-GC care to comment
 about the gcmodule change wrt. heaptypes, please do.

Did the best I could above, short of explaining exactly how failing to
identify an object with a finalizer can lead to disaster.  Short
course:  gc obviously needs to know which objects are and are not

Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-28 Thread Phillip J. Eby
At 06:13 PM 3/28/2006 -0500, Tim Peters wrote:
2. Because of the obscure gimmicks that try to cater to using old
 binary extension modules across new Python releases without
 recompiling, there's no guarantee that the tp_del slot even exists,
 and therefore we don't try to access tp_del at all unless
 PyType_HasFeature says the type object was compiled recently
 enough so that we know tp_del does exist.

When generators grew finalizers, the Only instances of ... classes
can have finalizers part of #1 became wrong, and I expect that's why
Phillip removed the conditional.  It was the right thing to do from
that point of view.

I don't understand the #2 gimmicks well enough to guess whether it was
actually safe to remove all guards before trying to access tp_del (I
run on Windows, and compiled extensions can never be reused across
Python minor releases on Windows -- if a problem was introduced here,
I'll never see it).

By that reasoning, binary compatibility won't be an issue anywhere else, 
either, since the change was made on the 2.5 alpha trunk, and ISTM that 2.5 
will require recompiling extensions anyway.

Now, the trick could potentially be made a bit smarter if there were a slot 
by which gcmodule could ask the object for a finalizer *dynamically*.  A 
generator without an active frame (or an active frame with no try blocks 
on the frame's block stack), has no need to run Python code for 
finalization.  Calling tp_clear on such a generator will do anything that 
the actual deletion would, only faster.  :)

So, that approach could be used to get rid of most new leaks caused by 
generator cycles, at the expense of a bit more complexity in the gc and in 
generators.  It won't fix leaks caused by cycles in active generators that 
have active try/finally or try/except blocks, since these are the things 
that actually need finalizing.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-28 Thread Thomas Wouters
On 3/29/06, Thomas Wouters [EMAIL PROTECTED] wrote:
So does that make all cycles involving only objects with finalizers impervious to cycle-gc? I guess it'd have to be that way.Er, I meant to say 'does that make all cycles involving just one object with a finalizer impervious to cycle-gc'.
-- Thomas Wouters [EMAIL PROTECTED]Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-28 Thread Tim Peters
[Thomas Wouters]
 So does that make all cycles involving only objects with finalizers
 impervious to cycle-gc? I guess it'd have to be that way.

[again]
 Er, I meant to say 'does that make all cycles involving just one object with
 a finalizer impervious to cycle-gc'.

Both are true, and both are implied by this sharper variant:

all cycles containing at least one object with a finalizer [etc]
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-27 Thread Thomas Wouters
On 3/27/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 Modified: python/trunk/Modules/itertoolsmodule.c Log: Make itertools.tee and its internal teedataobject participate in GC. This alone does not solve the leak in test_generators, unfortunately, but it is
 part of test_generators' problem and it does solve other cycles.Thanks for getting this in.To get the leak in test_generators, I think you make need to add GC to theteeobject as well as the teedataobject.
The teeobject has GC (hence the word 'and' in 'itertools.tee and its internal teedataobject' ;-) The problem with test_generators is that this also leaks:def leak(): def gen(): while True:
 yield g g = gen()leak()It doesn't leak in 2.4. I'm using a little shell script to work backwards through the 2.5 changes to find out which one introduced this, although I somehow suspect it's the coroutine stuff... ;P
-- Thomas Wouters [EMAIL PROTECTED]Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c

2006-03-27 Thread Neal Norwitz
On 3/27/06, Thomas Wouters [EMAIL PROTECTED] wrote:

 The teeobject has GC (hence the word 'and' in 'itertools.tee and its
 internal teedataobject' ;-) The problem with test_generators is that this
 also leaks:

 def leak():
 def gen():
 while True:
 yield g
 g = gen()

 leak()

Please add a test for this to the leakers and remove the tee test in
leakers if that no longer leaks.

Thanks,
n
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com