of destructors, open files and garbage collection
Hi,

Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a Ph.D. student in biophysics) but I have a fair knowledge of Python.

I have a for loop that looks like the following:

    for item in long_list:
        foo(item)

    def foo(item):
        item.create_blah()  # <-- this creates item.blah; by doing that it
                            # opens a file and leaves it open until
                            # blah.__del__() is called

Now, what I thought is that if I call

    del(item)

it will delete item and also all objects created inside item. So I thought that item.blah.__del__() would have been called and files closed.

Question 1:
This is not the case. I have to call del(item.blah), otherwise files are kept open and the for loop ends with a "Too many open files" error. Why isn't __del__() called on objects belonging to a parent object? Is it OK?

So I thought: oh, ok, let's put del(self.blah) in item.__del__()

Question 2:
This doesn't work either. Why?

Thanks a lot,
M.

--
http://mail.python.org/mailman/listinfo/python-list
Re: of destructors, open files and garbage collection
On 24 May, 16:40, "massimo s." <[EMAIL PROTECTED]> wrote:
> Now, what I thought is that if I call
>
> del(item)
>
> it will delete item and also all objects created inside item.

Sort of, but it's a bit more subtle. You'll stop the name "item" from referring to your item - if nothing else refers to your item, it will be garbage collected (and __del__ will get called). But you can have other references, and in this case, __del__ is not called until *they* are released as well.

Here's an example:

    >>> class C:
    ...     def __del__(self):
    ...         print "del called"
    ...
    >>> c = C()
    >>> # Now we have one reference to the object, in c. So delete it:
    ...
    >>> del c
    del called
    >>> # Just as we want.
    ... # Let's create a new C, but store a second reference to it in "a".
    ...
    >>> c = C()
    >>> a = c
    >>> # Now we can delete c, but a still refers to the object, so it isn't collected
    ...
    >>> del c
    >>> # But if we delete a, it is!
    ...
    >>> del a
    del called
    >>>

OK. Now in your case, it's a bit more complex. You delete item. Suppose that causes the item to be garbage collected (there are no other references). Then, the item will be collected. This removes the attribute item.blah, which refers to the blah object. So the blah object is collected - *as long as no other references exist to that item*.

Here's another example:

    >>> class B:
    ...     def __init__(self):
    ...         self.c = C()
    ...     def __del__(self):
    ...         print "B's delete called"
    ...
    >>> b = B()
    >>> del b
    B's delete called
    del called
    >>> # But if we have a second reference to b.c, that causes the object to stay alive:
    ...
    >>> b = B()
    >>> a = b.c
    >>> del b
    B's delete called
    >>> del a
    del called
    >>>

See? Even though b was collected, its c attribute is still accessible under the name 'a', so it's kept alive.

> Question 1:
> This is not the case. I have to call del(item.blah), otherwise files
> are kept open and the for loops end with a "Too many open files"
> error. Why isn't __del__() called on objects belonging to a parent
> object? Is it OK?
Did the above help to clarify?

> So I thought:
> oh, ok, let's put del(self.blah) in item.__del__()
> Question 2:
> This doesn't work either. Why?

It's not needed - it's not the item.blah reference that's keeping the blah object alive, it's another one.

You *can* fix this by tracking down all the references and explicitly deleting them one by one, but that's not really the best answer. You're micromanaging stuff the garbage collector is supposed to handle for you. Ultimately, you've got a design problem, as you're holding onto stuff you no longer need. Whether you use del, or add an explicit blah.close() method to close the filehandle, you've got to understand when you're finished with a filehandle - if you know that, you can close it at that point.

Here's a final example that may help:

    >>> a = []
    >>> for i in range(10):
    ...     a.append(C())
    ...
    >>> # Lots of work, none of which uses a
    ...
    >>> a = [] # or del a
    del called
    del called
    del called
    del called
    del called
    del called
    del called
    del called
    del called
    del called

See how you finished with all of the C objects right after the for loop, but they didn't get deleted until later? I suspect that's what's happening to you. If you cleared out the list (my a = [] statement) as soon as you're done with it, you get the resources back that much sooner.

Hope this helps,
Paul.
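For readers who prefer to see the "explicit close" option spelled out, here is a minimal sketch in modern Python 3 syntax. The names Blah, Item, close_files() and process() are hypothetical stand-ins for the original poster's classes, which the thread never shows; the only point is that a try/finally inside the loop releases each file at a known moment, whatever other references to the blah object may still exist.

```python
class Blah:
    """Hypothetical stand-in for whatever item.create_blah() builds."""

    def __init__(self, path):
        self.handle = open(path)  # the resource we must give back

    def close(self):
        # Safe to call more than once.
        if not self.handle.closed:
            self.handle.close()


class Item:
    """Hypothetical stand-in for the items in long_list."""

    def __init__(self, path):
        self.path = path
        self.blah = None

    def create_blah(self):
        self.blah = Blah(self.path)

    def close_files(self):
        # Explicit cleanup: does not depend on __del__ or on who else
        # still holds a reference to self.blah.
        if self.blah is not None:
            self.blah.close()
            self.blah = None


def process(items):
    """Open, use and close one item at a time; return how many were handled."""
    handled = 0
    for item in items:
        item.create_blah()
        try:
            handled += 1  # placeholder for the real work on item.blah
        finally:
            item.close_files()  # runs even if the work raises
    return handled
```

With this arrangement at most one file is open at any point in the loop, so the number of items no longer matters for the open-file limit.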
Re: of destructors, open files and garbage collection
In <[EMAIL PROTECTED]>, massimo s. wrote:

> I have a for loop that looks like the following :
>
> for item in long_list:
>     foo(item)
>
> def foo(item):
>     item.create_blah() #<--this creates item.blah; by doing that it
> opens a file and leaves it open until blah.__del__() is called
>
> Now, what I thought is that if I call
>
> del(item)
>
> it will delete item and also all objects created inside item.

It will delete the *name* `item`. It does nothing to the object that was bound to that name. If the name was the only reference to that object, it may be garbage collected sooner or later. Read the documentation for the `__del__()` method for more details and why implementing such a method increases the chance that the object *won't* be garbage collected!

Relying on the `__del__()` method isn't a good idea because there are no really hard guarantees by the language if and when it will be called.

Ciao,
Marc 'BlackJack' Rintsch
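One concrete case behind this warning is a reference cycle. The sketch below is an illustration only: Node, finalized and the helper functions are hypothetical names, and the behaviour described is that of CPython 3.4 or later. On the Python 2 interpreters this thread is about, objects with a `__del__()` method that were part of a cycle were not collected at all and were left in `gc.garbage`, which is the "won't be garbage collected" situation mentioned above.

```python
import gc

finalized = []  # names of the objects whose __del__ has run


class Node:
    """Hypothetical object that records its own finalization."""

    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    a, b = Node("a"), Node("b")
    a.partner, b.partner = b, a  # a -> b -> a
    # The local names vanish on return, but each object still holds the
    # other, so neither reference count drops to zero.


def demo():
    """Return what had been finalized before and after a forced collection."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the picture
    try:
        del finalized[:]
        make_cycle()
        before = list(finalized)  # nothing finalized yet
        gc.collect()              # the cycle detector finds the pair
        after = sorted(finalized)
    finally:
        if was_enabled:
            gc.enable()
    return before, after
```

Even on a modern interpreter the finalizers only run when the cycle detector gets around to it, which is one more reason not to tie file closing to `__del__()`.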
Re: of destructors, open files and garbage collection
> It will delete the *name* `item`. It does nothing to the object that was
> bound to that name. If the name was the only reference to that object, it
> may be garbage collected sooner or later. Read the documentation for the
> `__del__()` method for more details and why implementing such a method
> increases the chance that the object *won't* be garbage collected!
>
> Relying on the `__del__()` method isn't a good idea because there are no
> really hard guarantees by the language if and when it will be called.

Ok, I had a look at the docs and, in fact, relying on __del__ doesn't look like a good idea.

Changing the code to add an explicit method that closes dangling filehandles is easy. It would be rather nice because, since that method would be added to a plugin API, it *forces* people writing plugins to provide a way to close their dangling files, and this may be useful for a lot of future purposes.

However, I'd also like to track references to my objects; this would help debugging a lot. How can I do that?
Re: of destructors, open files and garbage collection
> Relying on the `__del__()` method isn't a good idea because there are no
> really hard guarantees by the language if and when it will be called.

Ok, I read the __del__() docs and I understand using it is not a good idea. I can easily add a close_files() method that forces all dangling files to be closed. It would be useful in a number of other possible situations.

However, as rightly pointed out in the exhaustive answer by Paul Moore, tracking references to my objects would be very useful. How can I do that?
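Nobody in the thread answers this question directly, so the following is only a suggestion: the standard library's `gc.get_referrers()` lists the container objects that refer to a given object, and `weakref.ref()` accepts a callback that fires when the object is finally destroyed. The sketch below wraps both in two small helpers; Blah, holders_among() and watch() are hypothetical names made up for the illustration, and the code uses Python 3 syntax.

```python
import gc
import weakref


class Blah:
    """Hypothetical stand-in for the object that keeps a file open."""


def holders_among(obj, candidates):
    """Return the candidates that directly hold a reference to obj."""
    referrers = gc.get_referrers(obj)
    return [c for c in candidates if any(c is r for r in referrers)]


def watch(obj, log):
    """Return a weak reference that appends to log when obj is destroyed.

    The weak reference does not keep obj alive, but it must itself be
    kept alive for the callback to fire.
    """
    return weakref.ref(obj, lambda ref: log.append("blah destroyed"))
```

A typical debugging session would call `gc.get_referrers(item.blah)` at the point where the file was expected to be closed and inspect what comes back; the list usually includes namespaces and frames as well as your own containers, so some filtering such as holders_among() is useful.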
Re: of destructors, open files and garbage collection
"massimo s." <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
| Hi,
|
| Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a ph.d.
| student in biophysics) but I have a fair knowledge of Python.
|
| I have a for loop that looks like the following :
|
| for item in long_list:
|     foo(item)
|
| def foo(item):
|     item.create_blah() #<--this creates item.blah; by doing that it
| opens a file and leaves it open until blah.__del__() is called
|
| Now, what I thought is that if I call
|
| del(item)
|
| it will delete item

No, it removes the association between the name 'item' and the object it is currently bound to. In CPython, removing the last such reference will cause the object to be gc'ed. In other implementations, actual deletion may occur later.

You probably should close the files directly and arrange code so that you can do so before too many are open.

tjr
Re: of destructors, open files and garbage collection
> No, it removes the association between the name 'item' and the object it is
> currently bound to. In CPython, removing the last such reference will
> cause the object to be gc'ed. In other implementations, actual deletion
> may occur later. You probably should close the files directly and arrange
> code so that you can do so before too many are open.

Thanks a lot, I'll follow that way.

m.
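As a footnote to the advice to close files directly: on later Python versions the `with` statement expresses "close it as soon as I am done" without any `del` bookkeeping. It was introduced in Python 2.5, so it was not available on the Python 2.4 installation discussed in this thread. The sketch below assumes a hypothetical OpenBlah wrapper and a hypothetical count_lines() job; neither comes from the original poster's code.

```python
class OpenBlah:
    """Hypothetical context manager around the file that blah needs."""

    def __init__(self, path):
        self.path = path
        self.handle = None

    def __enter__(self):
        self.handle = open(self.path)
        return self.handle

    def __exit__(self, exc_type, exc, tb):
        self.handle.close()  # always runs when the with block ends
        return False         # do not swallow exceptions


def count_lines(paths):
    """Process many files while keeping only one open at a time."""
    total = 0
    for path in paths:
        with OpenBlah(path) as handle:
            total += sum(1 for _ in handle)
        # handle is closed here, before the next file is opened
    return total
```

Because each file is closed before the next one is opened, the loop can run over an arbitrarily long list without approaching the "Too many open files" limit.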