On Jun 30, 8:24 pm, Carl Banks <[EMAIL PROTECTED]> wrote:
> On Jun 30, 1:55 pm, Tom Davis <[EMAIL PROTECTED]> wrote:
> > On Jun 26, 5:38 am, Carl Banks <[EMAIL PROTECTED]> wrote:
> > > On Jun 26, 5:19 am, Tom Davis <[EMAIL PROTECTED]> wrote:
> > > > I am having a problem where a long-running function will cause a
> > > > memory leak / balloon for reasons I cannot figure out. Essentially, I
> > > > loop through a directory of pickled files, load them, and run some
> > > > other functions on them. In every case, each function uses only local
> > > > variables and I even made sure to use `del` on each variable at the
> > > > end of the loop. However, as the loop progresses the amount of memory
> > > > used steadily increases.
> > >
> > > Do you happen to be using a single Unpickler instance? If so, change
> > > it to use a different instance each time. (If you just use the module-
> > > level load function you are already using a different instance each
> > > time.)
> > >
> > > Unpicklers hold a reference to everything they've seen, which prevents
> > > objects it unpickles from being garbage collected until it is
> > > collected itself.
> > >
> > > Carl Banks
> >
> > Carl,
> >
> > Yes, I was using the module-level unpickler. I changed it with little
> > effect. I guess perhaps this is my misunderstanding of how GC works.
> > For instance, if I have `a = Obj()` and run `a.some_method()` which
> > generates a highly-nested local variable that cannot be easily garbage
> > collected, it was my assumption that either (1) completing the method
> > call or (2) deleting the object instance itself would automatically
> > destroy any variables used by said method. This does not appear to be
> > the case, however. Even when a variable/object's scope is destroyed,
> > it would seem that variables/objects created within that scope cannot
> > always be reclaimed, depending on their complexity.
> >
> > To me, this seems illogical. I can understand that the GC is
> > reluctant to reclaim objects that have many connections to other
> > objects and so forth, but once those objects' scopes are gone, why
> > doesn't it force a reclaim?
>
> Are your objects involved in circular references, and do you have any
> objects with a __del__ method? Normally objects are reclaimed when
> the reference count goes to zero, but if there are cycles then the
> reference count never reaches zero, and they remain alive until the
> generational garbage collector makes a pass to break the cycle.
> However, the generational collector doesn't break cycles that involve
> objects with a __del__ method.

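For anyone following along, the reference-count versus cycle-collector behaviour described above can be observed directly. This is only a minimal sketch: the `Node` class and `make_cycle` helper are hypothetical names made up for the illustration, not anything from the code under discussion.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a <-> b form a reference cycle
    return weakref.ref(a)        # a weak reference does not keep `a` alive


gc.disable()                     # stop the automatic collector during the demo
probe = make_cycle()
# The function has returned, but the cycle keeps both refcounts above zero.
alive_before = probe() is not None
unreachable = gc.collect()       # force a full pass of the cycle collector
alive_after = probe() is not None
gc.enable()

print(alive_before, unreachable, alive_after)
```

The local names going out of scope is not enough here; the two objects only go away once the collector runs. On Python 3.4 and later the collector can also reclaim cycles whose objects define `__del__`; on older interpreters such cycles are left in `gc.garbage` instead.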
There are some circular references, but these are produced by objects
created by BeautifulSoup. I try to decompose all of them, but if
there's one part of the code to blame it's almost certainly this. I
have no objects with __del__ methods, at least none that I wrote.

> Are you calling any C extensions that might be failing to decref an
> object? There could be a memory leak.

Perhaps. Yet another thing to look into.

> Are you keeping a reference around somewhere? For example, appending
> results to a list, and the result keeps a reference to all of your
> unpickled data for some reason.

No.

> You know, we can throw out all these scenarios, but these suggestions
> are just common pitfalls. If it doesn't look like one of these
> things, you're going to have to do your own legwork to help isolate
> what's causing the behavior. Then if needed you can come back to us
> with more detailed information.
>
> Start with your original function, and slowly remove functionality
> from it until the bad behavior goes away. That will give you a clue
> what's causing it.

I realize this and thank you folks for your patience. I thought
perhaps there was something simple I was overlooking, but in this case
it would seem that there are dozens of things outside of my direct
control that could be causing this, most likely from third-party
libraries I am using. I will continue to try to debug this on my own
and see if I can figure anything out. Memory leaks and failing GC and
so forth are all new concerns for me.

Thanks Again,

Tom
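One possible way to do the "remove functionality until the bad behavior goes away" legwork is to count the objects the collector is tracking after each call to a candidate function. This is a rough sketch only: `object_growth`, `leaky_step` and `clean_step` are hypothetical stand-ins, and `gc.get_objects()` only sees container objects, so memory held by a C extension would not show up this way.

```python
import gc


def object_growth(step, iterations=5):
    """Call `step` repeatedly; return the tracked-object count after each call."""
    counts = []
    for _ in range(iterations):
        step()
        gc.collect()                          # reclaim collectable cycles first
        counts.append(len(gc.get_objects()))  # containers still alive
    return counts


_kept = []  # hypothetical stand-in for a reference that lingers by accident


def leaky_step():
    # Every call leaves 101 lists reachable through the module-level `_kept`.
    _kept.append([[i] for i in range(100)])


def clean_step():
    # Same work, but everything is local and is released on return.
    data = [[i] for i in range(100)]
    return len(data)


leaky = object_growth(leaky_step)
clean = object_growth(clean_step)
print("leaky growth:", leaky[-1] - leaky[0])
print("clean growth:", clean[-1] - clean[0])
```

A count that keeps climbing for one step but stays flat for another suggests where a reference is being retained. Swapping in pieces of the real loop one at a time (unpickling only, then unpickling plus parsing, and so on) should narrow it down.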