of destructors, open files and garbage collection

2007-05-24 Thread massimo s.
Hi,

Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a ph.d.
student in biophysics) but I have a fair knowledge of Python.

I have a for loop that looks like the following :

for item in long_list:
   foo(item)

def foo(item):
   item.create_blah()  # <-- this creates item.blah; by doing that it opens
                       # a file and leaves it open until blah.__del__() is called

Now, what I thought is that if I call

del(item)

it will delete item and also all objects created inside item. So I
thought that item.blah.__del__() would have been called and files
closed.
Question 1:
This is not the case. I have to call del(item.blah), otherwise files
are kept open and the for loop ends with a "Too many open files"
error. Why isn't __del__() called on objects belonging to a parent
object? Is it OK?

So I thought:
oh, ok, let's put del(self.blah) in item.__del__()
Question 2:
This doesn't work either. Why?

Thanks a lot,
M.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: of destructors, open files and garbage collection

2007-05-24 Thread Paul Moore
On 24 May, 16:40, "massimo s." <[EMAIL PROTECTED]> wrote:
> Now, what I thought is that if I call
>
> del(item)
>
> it will delete item and also all objects created inside item.

Sort of, but it's a bit more subtle. You'll stop the name "item" from
referring to your item - if nothing else refers to your item, it will
be garbage collected (and __del__ will get called). But you can have
other references, and in this case, __del__ is not called until *they*
are released as well.

Here's an example:

>>> class C:
...     def __del__(self):
...         print "del called"
...
>>> c = C()
>>> # Now we have one reference to the object, in c. So delete it:
...
>>> del c
del called
>>> # Just as we want.
... # Let's create a new C, but store a second reference to it in "a".
...
>>> c = C()
>>> a = c
>>> # Now we can delete c, but a still refers to the object, so it isn't collected
...
>>> del c
>>> # But if we delete a, it is!
...
>>> del a
del called
>>>

OK. Now in your case, it's a bit more complex. You delete item.
Suppose there are no other references to it, so the item is garbage
collected. That removes the attribute item.blah, which refers to the
blah object. So the blah object is collected as well - *as long as no
other references exist to the blah object*. Here's another example:

>>> class B:
...     def __init__(self):
...         self.c = C()
...     def __del__(self):
...         print "B's delete called"
...
>>> b = B()
>>> del b
B's delete called
del called
>>> # But if we have a second reference to b.c, that causes the object to stay alive:
...
>>> b = B()
>>> a = b.c
>>> del b
B's delete called
>>> del a
del called
>>>

See? Even though b was collected, its c attribute is still accessible
under the name 'a', so it's kept alive.

> Question 1:
> This is not the case. I have to call del(item.blah), otherwise files
> are kept open and the for loop ends with a "Too many open files"
> error. Why isn't __del__() called on objects belonging to a parent
> object? Is it OK?

Did the above help to clarify?

> So I thought:
> oh, ok, let's put del(self.blah) in item.__del__()
> Question 2:
> This doesn't work either. Why?

It's not needed - it's not the item.blah reference that's keeping the
blah object alive, it's another one.

You *can* fix this by tracking down all the references and explicitly
deleting them one by one, but that's not really the best answer.
You're micromanaging stuff the garbage collector is supposed to handle
for you. Ultimately, you've got a design problem, as you're holding
onto stuff you no longer need. Whether you use del, or add an explicit
blah.close() method to close the filehandle, you've got to understand
when you're finished with a filehandle - if you know that, you can
close it at that point.
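
As a rough sketch of the explicit-close approach (not the original
poster's code: the classes Blah and Item, the attribute handle and the
method close_files() are stand-ins invented for illustration), closing
the file by hand works even while other references to the object are
still around:

```python
import os
import tempfile


class Blah(object):
    """Hypothetical stand-in for the object that owns the open file."""

    def __init__(self, path):
        self.handle = open(path, "w")

    def close(self):
        # Closing an already closed file is a no-op, so this is safe to repeat.
        self.handle.close()


class Item(object):
    """Hypothetical stand-in for the items in long_list."""

    def __init__(self, path):
        self.path = path
        self.blah = None

    def create_blah(self):
        self.blah = Blah(self.path)

    def close_files(self):
        # Release the file explicitly instead of waiting for __del__.
        if self.blah is not None:
            self.blah.close()
            self.blah = None


# Use a throwaway temporary file so the sketch touches nothing else.
fd, path = tempfile.mkstemp()
os.close(fd)

item = Item(path)
item.create_blah()
handle = item.blah.handle  # a second reference to the file object
item.close_files()
closed_despite_extra_reference = handle.closed  # the file is closed anyway
os.remove(path)
```

The point of the design is that the moment of release is decided by
the caller, not by whenever the last reference happens to disappear.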

Here's a final example that may help:

>>> a = []
>>> for i in range(10):
...     a.append(C())
...
>>> # Lots of work, none of which uses a
...
>>> a = [] # or del a
del called
del called
del called
del called
del called
del called
del called
del called
del called
del called

See how you were finished with all of the C objects right after the for
loop, but they didn't get deleted until later? I suspect that's what's
happening to you. If you clear out the list (my a = [] statement) as
soon as you're done with it, you get the resources back that much
sooner.

Hope this helps,
Paul.



Re: of destructors, open files and garbage collection

2007-05-24 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, massimo s.
wrote:

> I have a for loop that looks like the following :
> 
> for item in long_list:
>    foo(item)
> 
> def foo(item):
>    item.create_blah() #<--this creates item.blah; by doing that it
>    # opens a file and leaves it open until blah.__del__() is called
> 
> Now, what I thought is that if I call
> 
> del(item)
> 
> it will delete item and also all objects created inside item.

It will delete the *name* `item`.  It does nothing to the object that was
bound to that name.  If the name was the only reference to that object, it
may be garbage collected sooner or later.  Read the documentation for the
`__del__()` method for more details and why implementing such a method
increases the chance that the object *won't* be garbage collected!

Relying on the `__del__()` method isn't a good idea because the language
gives no hard guarantees about if and when it will be called.
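
To illustrate the timing point, here is a minimal sketch. It assumes
CPython 3.4 or later, where the cyclic collector is able to finalize
objects that define `__del__()`; on the Python 2.4 discussed in this
thread, such a cycle would instead be left in `gc.garbage` and
`__del__()` would not run at all. The class `Node` and the `events`
list are made up for the illustration.

```python
import gc

events = []  # names of objects whose __del__ has run


class Node(object):
    """Hypothetical object with a finalizer that records its own name."""

    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        events.append(self.name)


gc.disable()  # switch off automatic cycle collection for the demonstration
try:
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a  # a and b now refer to each other: a reference cycle

    del a, b  # the names are gone, but the cycle keeps both objects alive
    called_before_collect = list(events)

    gc.collect()  # only the cyclic collector can reclaim the pair
    called_after_collect = sorted(events)
finally:
    gc.enable()
```

Deleting the names alone does not run either finalizer here; they run
only once the collector gets around to the cycle.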

Ciao,
Marc 'BlackJack' Rintsch


Re: of destructors, open files and garbage collection

2007-05-24 Thread massimo s.
> It will delete the *name* `item`.  It does nothing to the object that was
> bound to that name.  If the name was the only reference to that object, it
> may be garbage collected sooner or later.  Read the documentation for the
> `__del__()` method for more details and why implementing such a method
> increases the chance that the object *won't* be garbage collected!
>
> Relying on the `__del__()` method isn't a good idea because the language
> gives no hard guarantees about if and when it will be called.

Ok, I had a look at the docs and, in fact, relying on __del__ doesn't
look like a good idea.

Changing the code to add an explicit method that closes dangling
filehandles is easy. It would even be nice in a way because, since that
method would be added to a plugin API, it *forces* people writing
plugins to provide a way to close their dangling files, and this may be
useful for a lot of future purposes. However, I'd also like to track
references to my objects; this would help debugging a lot. How can I
do that?



Re: of destructors, open files and garbage collection

2007-05-24 Thread massimo s.
> Relying on the `__del__()` method isn't a good idea because the language
> gives no hard guarantees about if and when it will be called.

Ok, I read the __del__() docs and I understand using it is not a good
idea.

I can easily add a close_files() method that forces all dangling files
to be closed. It would be useful in a number of other possible
situations. However, as the thorough answer from Paul Moore rightly
points out, tracking references to my objects would be very useful.
How can I do that?
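
One possible starting point, sketched here under the assumption that
the interpreter is CPython, is the pair sys.getrefcount() and
gc.get_referrers() from the standard library. The class Blah and the
list holder below are hypothetical and exist only for the
demonstration.

```python
import gc
import sys


class Blah(object):
    """Hypothetical stand-in for the object whose references we want to find."""
    pass


blah = Blah()
holder = [blah]  # a second reference, hidden inside a list

# getrefcount() reports the number of references CPython currently sees.
# The call itself may temporarily add one, so treat the number as a hint.
count = sys.getrefcount(blah)

# get_referrers() returns the container objects that refer to blah,
# which shows *who* is keeping it alive, not just how many there are.
referrers = gc.get_referrers(blah)
holder_found = any(r is holder for r in referrers)
```

Both calls are debugging aids rather than something to build program
logic on, since their results depend on interpreter details.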



Re: of destructors, open files and garbage collection

2007-05-24 Thread Terry Reedy

"massimo s." <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Hi,
|
| Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a ph.d.
| student in biophysics) but I have a fair knowledge of Python.
|
| I have a for loop that looks like the following :
|
| for item in long_list:
|   foo(item)
|
| def foo(item):
|   item.create_blah() #<--this creates item.blah; by doing that it
|   # opens a file and leaves it open until blah.__del__() is called
|
| Now, what I thought is that if I call
|
| del(item)
|
| it will delete item

No, it removes the association between the name 'item' and the object it is 
currently bound to.  In CPython, removing the last such reference will 
cause the object to be gc'ed.  In other implementations, actual deletion 
may occur later.  You probably should close the files directly and arrange 
code so that you can do so before too many are open.
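
A minimal sketch of such an arrangement follows. It is only an
assumption about how the loop from the original post could be
restructured: the Item class, its counters and close_files() are
invented here, and the write is a placeholder for the real work.

```python
import os
import tempfile


class Item(object):
    """Hypothetical item: create_blah() opens a file, close_files() closes it."""

    open_count = 0  # files currently open, across all items
    peak_open = 0   # highest value open_count ever reached

    def __init__(self, path):
        self.path = path
        self.handle = None

    def create_blah(self):
        self.handle = open(self.path, "w")
        Item.open_count += 1
        Item.peak_open = max(Item.peak_open, Item.open_count)

    def close_files(self):
        if self.handle is not None:
            self.handle.close()
            self.handle = None
            Item.open_count -= 1


def foo(item):
    item.create_blah()
    try:
        item.handle.write("data\n")  # placeholder for the real work
    finally:
        item.close_files()  # runs even if the work above raises


tmpdir = tempfile.mkdtemp()
long_list = [Item(os.path.join(tmpdir, "f%d.txt" % i)) for i in range(50)]

for item in long_list:
    foo(item)

peak_open = Item.peak_open
still_open = Item.open_count

# Clean up the temporary files created for the demonstration.
for item in long_list:
    os.remove(item.path)
os.rmdir(tmpdir)
```

Because each file is closed before the next item is processed, the
number of open files stays at one no matter how long long_list is or
how many references to the items remain.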

tjr





Re: of destructors, open files and garbage collection

2007-05-26 Thread massimo s.
> No, it removes the association between the name 'item' and the object it is
> currently bound to.  In CPython, removing the last such reference will
> cause the object to be gc'ed.  In other implementations, actual deletion
> may occur later.  You probably should close the files directly and arrange
> code so that you can do so before too many are open.

Thanks a lot, I'll go that way.

m.
