Hi list,

Firstly, this is my first post here, so I hope I'm not breaking some
unwritten etiquette rule about asking questions involving several
different libraries.

I'm trying to plug some memory leaks in a TurboGears program.  We (the
Fedora Project) have a few TurboGears apps in our infrastructure that
all seem to be running into the same issue in a variety of
configurations.  Hopefully once I track down the cause in one app,
Smolt, we can fix the others too.

The app in question is Smolt, which uses TurboGears, SQLAlchemy with a
MySQL backend, and simplejson for message passing between the server
and client.  Smolt takes voluntary hardware reports from its clients,
which are generally configured to submit around the beginning of the
month.  Normally our main data is cached by separate, short-lived
processes, so we don't see any rapid memory growth except at the
beginning of each month, which makes isolating the problem to a few
function calls fairly simple.  To watch for memory growth, I simply
have a client hammer the server with 1-3 threads submitting
information simultaneously, 100 times each, with a few deletion
operations in between.  To monitor for memory leaks, I'm using Heapy.
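
In case it helps, the test client is roughly the following (the
endpoint URLs and payload fields are simplified placeholders here, not
the real Smolt API):

import threading
import urllib
import urllib2
import simplejson

SERVER = 'http://localhost:8080/client/add_json'    # hypothetical endpoint

def hammer(n=100):
    for i in range(n):
        # a fake hardware profile, serialized the way the real client does
        profile = simplejson.dumps({'uuid': 'test-%d' % i,
                                    'devices': ['dummy-device'] * 50})
        urllib2.urlopen(SERVER, urllib.urlencode({'host': profile})).read()
        if i % 10 == 0:
            # occasional deletion, again via a hypothetical endpoint
            urllib2.urlopen('http://localhost:8080/client/delete',
                            urllib.urlencode({'uuid': 'test-%d' % i})).read()

threads = [threading.Thread(target=hammer) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()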

To insert Heapy into the process, instead of calling 'start_server' (a
CherryPy method that does what you think it does, and blocks), I'm
using the 'threading' module to push it into a new thread.  Following
the procedure in Heapy's documentation, I find that after running a
single thread there are about 1200 bytes of leaked memory.  Despite
this, the Python process running the server has managed to grow from
16-18MB to something between 23-28MB each time I try this.  After a
second iteration, Heapy shows 1168 bytes leaked.  If Heapy is correct,
this means there are not many leaked objects in Python space.  Running
a larger example, say 100 threads for a total of 10k submissions,
takes about an hour, and in the process Python balloons up to about
48MB.  Still no sign of any leaked objects.
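
For reference, the setup looks roughly like this (a sketch only; the
start_server and Root imports below are the stock TurboGears
quickstart names, so adjust them to match the real start script):

import threading
from guppy import hpy

import turbogears
from smolt.controllers import Root    # whatever the start script imports

# push the blocking start_server call into a daemon thread
server_thread = threading.Thread(target=turbogears.start_server,
                                 args=(Root(),))
server_thread.setDaemon(True)
server_thread.start()

h = hpy()
h.setrelheap()      # only count objects allocated from this point on

# ... run the client load against the server, wait for it to finish ...

print h.heap()      # what was allocated since setrelheap() and is still alive

With the server in a daemon thread I keep the interactive prompt, so I
can poke at h.heap() between runs.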

48MB is not a lot, relatively speaking, but no amount of waiting seems
to get Python to give that memory back afterwards.  On our production
server, we have up to 200k machines all updating their information
over a 3-day period, during which the server process manages to reach
600MB before we forcefully restart it.

A couple of developers have mentioned that Python might be fragmenting
its memory space and so is unable to free those pages back to the OS.
How can I go about testing for this, and are there any known problems
like it?  If not, what else can I do to look for leaks?
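
To be clear about what I'd be comparing: what Heapy can see (live
Python objects) versus what the kernel says the process is holding,
something along these lines (Linux-specific sketch):

from guppy import hpy

def rss_kb():
    # resident set size from the kernel's point of view (Linux only)
    for line in open('/proc/self/status'):
        if line.startswith('VmRSS:'):
            return int(line.split()[1])

h = hpy()
print 'heapy-visible heap: %d bytes' % h.heap().size
print 'process RSS:        %d kB' % rss_kb()

If the RSS keeps climbing while the Heapy total stays flat, I'm
assuming that points at fragmentation rather than at objects Heapy
would report as leaked.  Is that a reasonable way to check?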

-Yaakov