Re: [TIP] Anyone still using Python 2.5?
On Dec 21, 2011, at 2:15 AM, Chris Withers wrote:

> Hi All,
>
> What's the general consensus on supporting Python 2.5 nowadays?
>
> Do people still have to use this in commercial environments or is
> everyone on 2.6+ nowadays?

For those of us living the nightmare of App Engine *and* working on an app old enough to not be using the newer Datastore, we're stuck on 2.5 until we can justify a data migration. Frankly, I don't know how many such apps are still actively developed these days, though.

> I'm finally getting some continuous integration set up for my packages
> and it's highlighting some 2.5 compatibility issues. I'm wondering
> whether to fix those (lots of ugly "from __future__ import
> with_statement" everywhere) or just to drop Python 2.5 support.
>
> What do people feel?
>
> cheers,
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
> - http://www.simplistix.co.uk

--
http://mail.python.org/mailman/listinfo/python-list
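For anyone hitting the same thing: the 2.5 shim Chris mentions is just a single future import at the top of each module. A minimal sketch (the file name here is a throwaway, nothing project-specific):

```python
from __future__ import with_statement  # required on 2.5; a harmless no-op on 2.6+

import os
import tempfile

# A throwaway file, purely to exercise the statement.
path = os.path.join(tempfile.gettempdir(), "with_demo.txt")
with open(path, "w") as f:   # the file is closed even if the body raises
    f.write("hello\n")
with open(path) as f:
    content = f.read().strip()
os.remove(path)
print(content)               # -> hello
```

The import has to be the first statement in the module, which is what makes it so ugly when sprinkled across a whole package.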
Re: Looping-related Memory Leak
On Jun 30, 8:24 pm, Carl Banks <[EMAIL PROTECTED]> wrote:

> On Jun 30, 1:55 pm, Tom Davis <[EMAIL PROTECTED]> wrote:
> > On Jun 26, 5:38 am, Carl Banks <[EMAIL PROTECTED]> wrote:
> > > On Jun 26, 5:19 am, Tom Davis <[EMAIL PROTECTED]> wrote:
> > > > I am having a problem where a long-running function will cause a memory leak / balloon for reasons I cannot figure out. Essentially, I loop through a directory of pickled files, load them, and run some other functions on them. In every case, each function uses only local variables and I even made sure to use `del` on each variable at the end of the loop. However, as the loop progresses the amount of memory used steadily increases.
> > >
> > > Do you happen to be using a single Unpickler instance? If so, change it to use a different instance each time. (If you just use the module-level load function you are already using a different instance each time.)
> > >
> > > Unpicklers hold a reference to everything they've seen, which prevents objects they unpickle from being garbage collected until the Unpickler itself is collected.
> > >
> > > Carl Banks
> >
> > Carl,
> >
> > Yes, I was using the module-level unpickler. I changed it with little effect. I guess perhaps this is my misunderstanding of how GC works. For instance, if I have `a = Obj()` and run `a.some_method()` which generates a highly-nested local variable that cannot be easily garbage collected, it was my assumption that either (1) completing the method call or (2) deleting the object instance itself would automatically destroy any variables used by said method. This does not appear to be the case, however. Even when a variable/object's scope is destroyed, it would seem that variables/objects created within that scope cannot always be reclaimed, depending on their complexity.
> >
> > To me, this seems illogical. I can understand that the GC is reluctant to reclaim objects that have many connections to other objects and so forth, but once those objects' scopes are gone, why doesn't it force a reclaim?
>
> Are your objects involved in circular references, and do you have any objects with a __del__ method? Normally objects are reclaimed when the reference count goes to zero, but if there are cycles then the reference count never reaches zero, and they remain alive until the generational garbage collector makes a pass to break the cycle. However, the generational collector doesn't break cycles that involve objects with a __del__ method.

There are some circular references, but these are produced by objects created by BeautifulSoup. I try to decompose all of them, but if there's one part of the code to blame it's almost certainly this. I have no objects with __del__ methods, at least none that I wrote.

> Are you calling any C extensions that might be failing to decref an object? There could be a memory leak.

Perhaps. Yet another thing to look into.

> Are you keeping a reference around somewhere? For example, appending results to a list, where each result keeps a reference to all of your unpickled data for some reason.

No.

> You know, we can throw out all these scenarios, but these suggestions are just common pitfalls. If it doesn't look like one of these things, you're going to have to do your own legwork to help isolate what's causing the behavior. Then if needed you can come back to us with more detailed information.
>
> Start with your original function, and slowly remove functionality from it until the bad behavior goes away. That will give you a clue what's causing it.

I realize this and thank you folks for your patience. I thought perhaps there was something simple I was overlooking, but in this case it would seem that there are dozens of things outside of my direct control that could be causing this, most likely from the third-party libraries I am using. I will continue to try to debug this on my own and see if I can figure anything out. Memory leaks, failing GC, and so forth are all new concerns for me.

Thanks again,
Tom
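For the archive, the cycle behavior Carl describes is easy to reproduce. A toy sketch (the Node class is made up, not code from this thread) showing that a reference cycle survives `del` but is reclaimed by a collector pass:

```python
import gc
import weakref

class Node(object):
    # Deliberately no __del__: in Python 2, a __del__ inside a cycle
    # makes the cycle uncollectable (it lands in gc.garbage instead).
    pass

a = Node()
b = Node()
a.partner = b            # a -> b
b.partner = a            # b -> a: a reference cycle
probe = weakref.ref(a)   # lets us observe when the object is reclaimed

del a, b                 # names gone, but the cycle keeps refcounts above zero
gc.collect()             # the generational collector breaks the cycle
print(probe() is None)   # -> True: the cycle was reclaimed
```

Without the `gc.collect()` call, the two Node objects linger until the collector happens to run a pass on their generation, which is why memory can climb between passes even with no true leak.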
Re: Looping-related Memory Leak
On Jun 30, 3:12 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:

> On Mon, 30 Jun 2008 10:55:00 -0700, Tom Davis wrote:
> > To me, this seems illogical. I can understand that the GC is reluctant to reclaim objects that have many connections to other objects and so forth, but once those objects' scopes are gone, why doesn't it force a reclaim? For instance, I can use timeit to create an object instance, run a method of it, then `del` the variable used to store the instance, but each loop thereafter continues to require more memory and take more time. 1000 runs may take .27 usec/pass whereas 10 take 2 usec/pass (average).
>
> `del` just removes the name and one reference to that object. Objects are only deleted when there's no reference to them anymore. Your example sounds like you keep references to objects somehow that are accumulating, maybe by accident. Any class-level bound mutables or mutable default values in functions in that source code? Would be my first guess.
>
> Ciao,
> Marc 'BlackJack' Rintsch

Marc,

Thanks for the tips. A quick confirmation: I took "class level bound mutables" to mean something like:

    class A(object):
        SOME_MUTABLE = [1, 2]
        ...

And "mutable default values" to mean:

    ...
    def a(self, arg=[1, 2]):
        ...

If this is correct, I have none of these. I understand your point about the references, but in my `timeit` example the statement is as simple as this:

    from mymodule import MyClass
    a = MyClass()
    del a

So, yes, it would seem that object references are piling up and not being removed, entirely by accident. Is there some kind of list somewhere that says "if your class has any of these attributes (mutable defaults, class-level mutables, etc.) it may not be properly dereferenced"? My obvious hack around this is to only do X loops at a time and set up a cron job to run the script over and over until all the files have been processed, but I'd much prefer to make the code run as intended.
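To make the pitfall concrete for anyone else reading this later, here is a toy sketch (a hypothetical Parser class, not real code from this thread) of how both kinds of mutables quietly accumulate data that outlives every instance:

```python
class Parser(object):               # hypothetical class for illustration
    CACHE = []                      # class-level mutable: shared by ALL instances

    def feed(self, item, seen=[]):  # mutable default: ONE list per function, not per call
        # Both lists live on the class/function objects, so "del parser"
        # frees the instance but not anything appended here.
        self.CACHE.append(item)
        seen.append(item)
        return len(seen)

p1 = Parser()
p1.feed("doc1")
del p1                          # the instance is gone; the appended data is not

p2 = Parser()
print(p2.feed("doc2"))          # -> 2: the default list still remembers "doc1"
print(len(Parser.CACHE))        # -> 2: so does the class attribute
```

In a loop over thousands of documents, either of these grows without bound even though every instance is created and deleted locally.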
I ran a test overnight and found that at first a few documents were handled per second, but by the time I woke up it had slowed down so much that it took over an hour to process a single document! The RAM usage went from 20 MB at the start to over 300 MB, when it should never need more than about 20 MB, because everything is handled with local variables and new objects are instantiated for each document. This is a serious problem.

Thanks,
Tom
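One way to see what is actually piling up toward those 300 MB, without guessing: snapshot the live-object counts that the collector can see before and after a batch, and diff them. A rough diagnostic sketch (the leaking loop below is a stand-in for the real document-processing code):

```python
import gc

def type_counts():
    """Tally live, collector-tracked objects by type name."""
    counts = {}
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] = counts.get(name, 0) + 1
    return counts

def top_growth(before, after, top=5):
    """Types whose live-object count grew the most between two snapshots."""
    growth = [(after.get(name, 0) - before.get(name, 0), name) for name in after]
    growth.sort(reverse=True)
    return [(name, grew) for grew, name in growth[:top] if grew > 0]

before = type_counts()
leaky = []
for i in range(1000):
    leaky.append([i])          # stand-in for a processing pass that leaks
after = type_counts()
print(top_growth(before, after))   # 'list' dominates the growth here
```

Running this every N documents points at which type is accumulating (BeautifulSoup tags, unpickled records, etc.) rather than just showing that total memory grows.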
Re: Looping-related Memory Leak
On Jun 26, 5:38 am, Carl Banks <[EMAIL PROTECTED]> wrote:

> On Jun 26, 5:19 am, Tom Davis <[EMAIL PROTECTED]> wrote:
> > I am having a problem where a long-running function will cause a memory leak / balloon for reasons I cannot figure out. Essentially, I loop through a directory of pickled files, load them, and run some other functions on them. In every case, each function uses only local variables and I even made sure to use `del` on each variable at the end of the loop. However, as the loop progresses the amount of memory used steadily increases.
>
> Do you happen to be using a single Unpickler instance? If so, change it to use a different instance each time. (If you just use the module-level load function you are already using a different instance each time.)
>
> Unpicklers hold a reference to everything they've seen, which prevents objects they unpickle from being garbage collected until the Unpickler itself is collected.
>
> Carl Banks

Carl,

Yes, I was using the module-level unpickler. I changed it with little effect. I guess perhaps this is my misunderstanding of how GC works. For instance, if I have `a = Obj()` and run `a.some_method()` which generates a highly-nested local variable that cannot be easily garbage collected, it was my assumption that either (1) completing the method call or (2) deleting the object instance itself would automatically destroy any variables used by said method. This does not appear to be the case, however. Even when a variable/object's scope is destroyed, it would seem that variables/objects created within that scope cannot always be reclaimed, depending on their complexity.

To me, this seems illogical. I can understand that the GC is reluctant to reclaim objects that have many connections to other objects and so forth, but once those objects' scopes are gone, why doesn't it force a reclaim? For instance, I can use timeit to create an object instance, run a method of it, then `del` the variable used to store the instance, but each loop thereafter continues to require more memory and take more time. 1000 runs may take .27 usec/pass whereas 10 take 2 usec/pass (average).
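What `del` actually does in that statement can be shown in a few lines. A small sketch (hypothetical class; CPython's reference counting assumed): `del` unbinds one name, and the object is reclaimed only when the *last* reference goes.

```python
import weakref

class Obj(object):
    pass

a = Obj()
b = a                        # a second reference to the same object
probe = weakref.ref(a)       # observe the object without keeping it alive

del a                        # unbinds the name 'a'; one reference dropped
print(probe() is not None)   # -> True: 'b' still keeps the object alive

del b                        # last reference gone
print(probe() is None)       # -> True: reclaimed at once (no cycle involved)
```

So if something elsewhere (a cache, a module-level list, a default argument) still holds a reference, `del` at the end of the loop body frees nothing.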
Looping-related Memory Leak
I am having a problem where a long-running function will cause a memory leak / balloon for reasons I cannot figure out. Essentially, I loop through a directory of pickled files, load them, and run some other functions on them. In every case, each function uses only local variables and I even made sure to use `del` on each variable at the end of the loop. However, as the loop progresses the amount of memory used steadily increases.

I had a related problem before where I would loop through a very large data set of files and cache objects that were used to parse or otherwise operate on different files in the data set. Once again, only local variables were used in the cached objects' methods. After a while it got to the point where simply running these methods on the data took so long that I had to terminate the process (think: first iteration 0.01 s, 1000th iteration 10 s). The solution I found was to cause the cached objects to become "stale" after a certain number of uses and be deleted and re-instantiated.

However, in the current case there is no caching being done at all. Only local variables are involved. It would seem that over time objects take up more memory even when there are no attributes being added to them or altered.

Has anyone experienced similar anomalies? Is this behavior to be expected for some other reason? If not, is there a common fix for it, i.e. manual GC or something?
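For concreteness, the loop in question is shaped roughly like the sketch below (the pickled documents and `process()` here are throwaway stand-ins for the real data and functions; the tiny directory is built just so the sketch runs end to end):

```python
import gc
import os
import pickle
import shutil
import tempfile

# Build a tiny throwaway directory of pickled files so the sketch is runnable.
data_dir = tempfile.mkdtemp()
for i in range(5):
    with open(os.path.join(data_dir, "doc%d.pkl" % i), "wb") as f:
        pickle.dump({"id": i, "body": "x" * 10}, f)

def process(doc):
    """Stand-in for the real per-document work."""
    return doc["id"]

results = []
for name in sorted(os.listdir(data_dir)):
    with open(os.path.join(data_dir, name), "rb") as f:
        doc = pickle.load(f)   # module-level load: a fresh Unpickler per file
    results.append(process(doc))
    del doc                    # unbinds the local name at the end of the pass
    gc.collect()               # force a collection pass each iteration

shutil.rmtree(data_dir)
print(results)                 # -> [0, 1, 2, 3, 4]
```

Even structured like this, with nothing kept between iterations on purpose, the resident memory grows, which is what I cannot explain.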