Hi list,

Firstly, this is my first post here, so I hope I'm not breaking some unwritten etiquette rule about asking questions involving several different libraries.
I'm trying to plug some memory leaks in a TurboGears program. We (the Fedora Project) have a few TurboGears apps in our infrastructure that all seem to be running into the same issue in a variety of configurations, so hopefully once I track down the cause in one app we can fix the others too.

The app in question is Smolt, which uses TurboGears, SQLAlchemy with a MySQL backend, and simplejson for message passing between the server and client. Smolt takes voluntary hardware reports from its clients, which are generally configured to submit around the beginning of the month. Normally our main data is cached by separate, short-lived processes, so we don't see any rapid memory growth except at the beginning of each month, which makes isolating the problem to a few function calls fairly simple.

To provoke memory growth, I simply have a client hammer the server with 1-3 threads submitting information simultaneously, 100 times each, with a few deletion operations in between. To monitor for leaks I'm using Heapy. To get Heapy into the process, instead of calling 'start_server' (a cherrypy method that does what you think it does, and blocks), I use the 'threading' module to push the server into a new thread (a stripped-down sketch of the setup is in the P.S. below).

Following the procedure in Heapy's documentation, I find that after running a single thread there are about 1200 bytes of leaked memory. Despite this, the Python process running the server grows from 16-18MB to somewhere between 23-28MB each time I try this. After a second iteration, Heapy shows 1168 bytes leaked. If Heapy is correct, that means there are not many leaked objects in the Python space.

Running a larger example, say 100 threads for a total of 10k submissions, takes about an hour, and in the process Python balloons up to about 48MB. Still no sign of any missing objects. 48MB is not a lot, relatively speaking, but no amount of waiting seems to get Python to give that memory back afterwards. On our production server, up to 200k machines update their information over a 3-day period, during which the server process manages to reach 600MB before we forcefully restart it.

A couple of developers have mentioned that Python might be fragmenting its memory space and is unable to free up those pages. How can I go about testing for this, and are there any known problems like this? If not, what else can I do to look for leaks?

-Yaakov
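P.S. In case it helps, here is roughly how I'm pushing the server into a thread and taking the Heapy measurements, boiled down to a sketch (the 'Root' import path is a stand-in for our real controller, and config loading is omitted):

    import threading
    import time
    from guppy import hpy

    from turbogears import start_server
    from smolt_server.controllers import Root   # stand-in for the real controller import

    # start_server() blocks, so push it into its own thread and keep the
    # main thread free for poking at the heap interactively.
    server = threading.Thread(target=start_server, args=(Root(),))
    server.setDaemon(True)
    server.start()
    time.sleep(5)          # give the server a moment to finish starting up

    hp = hpy()
    hp.setrelheap()        # only count objects allocated after this point

    # ... let the client hammer the server, then:
    leftover = hp.heap()   # what survived, relative to the setrelheap() mark
    print leftover         # this is where I see the ~1200 bytes
    print leftover.byrcs   # same survivors, grouped by referrer pattern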
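The process sizes I quoted above are what I see from outside the process; the comparison I keep coming back to is that number against what Heapy says is live inside the process, something like this (a rough, Linux-specific sketch):

    import os
    from guppy import hpy

    def rss_kb():
        # Linux-specific: pull VmRSS out of /proc/self/status (value is in kB).
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1])
        return 0

    hp = hpy()
    heap_mb = hp.heap().size / 1024.0 / 1024.0   # bytes accounted for by live Python objects
    rss_mb = rss_kb() / 1024.0                   # what the OS has actually given the process

    print 'heapy reachable heap: %.1f MB' % heap_mb
    print 'process RSS:          %.1f MB' % rss_mb
    # The large and growing gap between these two numbers is what makes me wonder
    # about fragmentation, or the allocator simply not returning pages to the OS.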