Bugs item #1208304, was opened at 2005-05-25 09:20
Message generated for change (Comment added) made by jafo
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1208304&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Extension Modules
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Petr Toman (manekcz)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2's urlopen() method causes a memory leak

Initial Comment:
It seems that the urlopen(url) method of the urllib2 module leaves some undestroyable objects in memory.

Please try the following code:
==========================
if __name__ == '__main__':
    import urllib2
    a = urllib2.urlopen('http://www.google.com')
    del a # or a = None or del(a)

    # check memory for leaks
    import gc
    gc.set_debug(gc.DEBUG_SAVEALL)
    gc.collect()
    for it in gc.garbage:
        print it
==========================

In our code, we're using lots of urlopens in a loop and the number of unreachable objects grows beyond all limits :) We also tried a.close() but it didn't help.

You can also try the following:
==========================
def print_unreachable_len():
    # check memory for leaks
    import gc
    gc.set_debug(gc.DEBUG_SAVEALL)
    gc.collect()
    unreachableL = []
    for it in gc.garbage:
        unreachableL.append(it)
    return len(str(unreachableL))

if __name__ == '__main__':
    print "at the beginning", print_unreachable_len()

    import urllib2
    print "after import of urllib2", print_unreachable_len()

    a = urllib2.urlopen('http://www.google.com')
    print 'after urllib2.urlopen', print_unreachable_len()

    del a
    print 'after del', print_unreachable_len()
==========================

We're using Windows XP with the latest patches, Python 2.4 (ActivePython 2.4 Build 243 (ActiveState Corp.) based on Python 2.4 (#60, Nov 30 2004, 09:34:21) [MSC v.1310 32 bit (Intel)] on win32).

----------------------------------------------------------------------

>Comment By: Sean Reifschneider (jafo)
Date: 2005-06-29 03:52

Message:
Logged In: YES
user_id=81797

I give up, this code is kind of a maze of twisty little passages. I did try doing "a.fp.close()" and that didn't seem to help at all. Couldn't really make any progress on that though. I also tried doing an "if a.headers.fp: a.headers.fp.close()", which didn't do anything noticeable.

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2005-06-29 03:27

Message:
Logged In: YES
user_id=81797

I can reproduce this in both the python.org 2.4 RPM and in a freshly built copy from CVS. If I run a few thousand urlopen()s, I get:

Traceback (most recent call last):
  File "/tmp/mt", line 26, in ?
  File "/tmp/python/dist/src/Lib/urllib2.py", line 130, in urlopen
  File "/tmp/python/dist/src/Lib/urllib2.py", line 361, in open
  File "/tmp/python/dist/src/Lib/urllib2.py", line 379, in _open
  File "/tmp/python/dist/src/Lib/urllib2.py", line 340, in _call_chain
  File "/tmp/python/dist/src/Lib/urllib2.py", line 1026, in http_open
  File "/tmp/python/dist/src/Lib/urllib2.py", line 1001, in do_open
urllib2.URLError: <urlopen error (24, 'Too many open files')>

This happens even if I call a.close(). I'll investigate a bit further.

Sean

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2005-06-01 23:13

Message:
Logged In: YES
user_id=11375

Confirmed. The objects involved seem to be an HTTPResponse and the socket._fileobject wrapper; the assignment 'r.recv=r.read' around line 1013 of urllib2.py seems to be critical to creating the cycle.

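
The kind of cycle described in the 2005-06-01 comment can be sketched in isolation. This is only an illustration, not the urllib2 code: FakeResponse and count_cycle_garbage are hypothetical names, and the sketch assumes that storing a bound method on its own instance (as in 'r.recv=r.read') is what ties the object to itself, so that only the cyclic collector can reclaim it.

```python
import gc


class FakeResponse(object):
    """Hypothetical stand-in for a response object (not urllib2 code)."""

    def read(self):
        return b""


def count_cycle_garbage(make_cycle):
    """Create one FakeResponse, drop it, and count the FakeResponse
    instances that only the cyclic garbage collector could reclaim."""
    gc.collect()                    # clear out any pre-existing garbage
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        r = FakeResponse()
        if make_cycle:
            # The bound method r.read refers to r, and r's attributes
            # now refer to the bound method: r -> r.recv -> r.
            r.recv = r.read
        del r                       # frees r at once only if no cycle exists
        gc.collect()
        found = [o for o in gc.garbage if isinstance(o, FakeResponse)]
        return len(found)
    finally:
        gc.set_debug(0)             # restore normal collection
        del gc.garbage[:]


if __name__ == '__main__':
    print(count_cycle_garbage(False))
    print(count_cycle_garbage(True))
```

Without the assignment, reference counting frees the object immediately and nothing lands in gc.garbage; with it, the instance survives until a collection runs. If the real response object holds a socket the same way, that would be consistent with the 'Too many open files' error reported above, though this sketch does not demonstrate that part.
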
----------------------------------------------------------------------