Re: [jira] Created: (MODPYTHON-184) Memory leak apache.table()
Alexis Marrero wrote: Jim, This are my results for the memory leak search in apache.table(). The table object creates a memory pool by using apr_pool_create_ex() and destroys the pool using apr_pool_destroy(). I added a line in MpTable_New() before return (PyObject*)t to destroy the pool and ran 1M iterations and I notice that there was no memory leak. Therefore the apache functions seems to be working fine. Actually I don't think apr_pool_destroy() in table_dealloc is actually destroying the pool. I've been poking around in the code and there is something odd going on here. I tried registering a cleanup in MpTable_New() using: apr_pool_cleanup_register(t-pool, pool cleanup called, cleanup_test, apr_pool_cleaunp_null); The cleanup_test callback just logs the pool cleanup called message to a file. apr_pool_destroy() is getting called in table_dealloc, but cleanup_test never gets called which indicates that the pool is *not* being destroyed, and hence our memory leak. I tried your trick of immediately calling apr_pool_destroy in MpTable_New(), and cleanup_test does get called there. So, the big question is... why is the pool not being destroyed? Can anyone offer some insight here? The attached diff is for trunk if anyone wants to play around with it. Jim Index: tableobject.c === --- tableobject.c (revision 431994) +++ tableobject.c (working copy) @@ -59,6 +59,19 @@ return (PyObject *)result; } + + +apr_status_t cleanup_test(void *msg) +{ +FILE *f; +f = fopen(/tmp/debug_table.log, a+); +fprintf(f, %s\n, (char *)msg); +fclose(f); + +return 0; +} + + /** ** MpTable_New ** @@ -78,6 +91,8 @@ tableobject *t; apr_pool_t *p; +cleanup_test(MpTable_New() called); + /* XXX need second arg abort function to report mem error */ apr_pool_create_ex(p, NULL, NULL, NULL); @@ -86,7 +101,12 @@ /* remember the pointer to our own pool */ t-pool = p; +apr_pool_cleanup_register(p, pool cleanup called, cleanup_test, apr_pool_cleanup_null); +/* Uncomment this to test that cleanup_test is getting called correctly. +apr_pool_destroy(t-pool); +*/ + return (PyObject *)t; } @@ -99,10 +119,13 @@ static void table_dealloc(register tableobject *self) { +cleanup_test(table_dealloc:); if (MpTable_Check(self)) { -if (self-pool) +if (self-pool) { +cleanup_test( preparing to destroy the pool); apr_pool_destroy(self-pool); +} PyObject_Del(self); } else
Re: [jira] Created: (MODPYTHON-184) Memory leak apache.table()
On 8/16/06, Jim Gallacher [EMAIL PROTECTED] wrote: Actually I don't think apr_pool_destroy() in table_dealloc is actually destroying the pool. I've been poking around in the code and there is something odd going on here. I would actually love to test this, but I can't build trunk on Mac OS X. The definitions of PyLFS (LINKFORSHARED) refer to values that I don't have: LINKFORSHARED= -u __dummy -u _PyMac_Error -framework System $(PYTHONFRAMEWORKDIR)/Versions/$(VERSION)/$(PYTHONFRAMEWORK) -framework CoreServices -framework Foundation That's not the right line. It should be looking at LDSHARED and PYTHONFRAMEWORK instead. Ideally, we should switch to a format like what Subversion uses to detect the values: which queries Python directly. See: http://svn.collab.net/repos/svn/trunk/build/get-py-info.py The sed magic going on in autoconf is just too creaky. (Subversion builds against Python on Mac OS X just fine.) If we import get-py-info.py to MP, we should prepend it with the license block from: http://svn.collab.net/repos/svn/trunk/subversion/LICENSE (That script doesn't have a license block, but it really should. I'll poke other SVN devs about fixing that, but importing that LICENSE block is a safe compromise in the meantime.) Thanks. -- justin
Re: New module importer - why not make it the default in 3.3?
The new importer gets my vote. I've been using it for a while now in my development servers and it works great. I've not discovered any bugs. I've verified it with PythonAutoReload and PythonDebug in any combination of On and Off. For a complex hierarchy of python files in with both set to off, I've verified it doesn't touch the filesystem. For both set to on it touches the filesystem around 700 times (still 1 second), I wouldn't recommend that people do that on servers without lots of spare time. It's a real time saver for development though! I hardly ever have to restart apache anymore. -Dan On 8/12/06, Jim Gallacher [EMAIL PROTECTED] wrote: Graham, In your JIRA cleanup session you made the comment a number of times that the new importer is not likely to be the default in 3.3. I'm just wondering why it can't be the default, with the old importer as an option? Maybe: PythonOption mod_python.legacy.importer * If we were to do this we'd likely want a longer 3.3 beta test period, but if it takes care of the ongoing module import issues it might be worth it. Jim
Building on Mac OS X was [jira] Created: (MODPYTHON-184) Memory leak apache.table()
On 8/16/06, Graham Dumpleton [EMAIL PROTECTED] wrote: I would actually love to test this, but I can't build trunk on Mac OS X. Huh! Do you have more than one version of Python installed? Nope. I just have Python from /usr/bin/python. Nothing special. I do all my work on Mac OS X and have no problem. I only have the standard OSE supplied version of Python installed though. Not sure what you mean by OSE - do you mean OS? Anyway, I just rely on what comes with the OS. I'm sure it worked at some point, but that's probably years ago. I have: Python 2.3.5 (#1, Mar 20 2005, 20:38:20) [GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin FWIW, am using Mac OS X 10.4.7 at present. So am I. mod_python.so doesn't link as the LINKFORSHARED values don't expand properly. It's not even the right value to be bringing in from Python's Makefile! Because of other stuff, have been ignoring the table object issues, but maybe I should start paying more attention. :-) I'm more than happy to fix up autoconf to stop doing the insanity with sed and determine those values within Python directly. I'll try to do so this weekend: time permitting. (Greg and I and the other SVN devs solved this problem already within Subversion.) However, since I can't build, this means that I can't debug this pool problem. =( -- justin
Re: Building on Mac OS X was [jira] Created: (MODPYTHON-184) Memory leak apache.table()
Justin Erenkrantz wrote .. On 8/16/06, Graham Dumpleton [EMAIL PROTECTED] wrote: I would actually love to test this, but I can't build trunk on Mac OS X. Huh! Do you have more than one version of Python installed? Nope. I just have Python from /usr/bin/python. Nothing special. Strange. I do all my work on Mac OS X and have no problem. I only have the standard OSE supplied version of Python installed though. Not sure what you mean by OSE - do you mean OS? Anyway, I just rely on what comes with the OS. I meant OS. My slip up was because I actually do make available a product called OSE and so have a predilection for typing that. Same sort of thing when I have to use cvs at one of my jobs, I always keep typing 'svn'. :-) mod_python.so doesn't link as the LINKFORSHARED values don't expand properly. It's not even the right value to be bringing in from Python's Makefile! Because of other stuff, have been ignoring the table object issues, but maybe I should start paying more attention. :-) I'm more than happy to fix up autoconf to stop doing the insanity with sed and determine those values within Python directly. I'll try to do so this weekend: time permitting. (Greg and I and the other SVN devs solved this problem already within Subversion.) Would what you have in mind address the following in any way: https://issues.apache.org/jira/browse/MODPYTHON-20 Or is that a separate unrelated problem? Graham
Re: New module importer - why not make it the default in 3.3?
Dan Eloff wrote .. The new importer gets my vote. I've been using it for a while now in my development servers and it works great. I've not discovered any bugs. I've verified it with PythonAutoReload and PythonDebug in any combination of On and Off. For a complex hierarchy of python files in with both set to off, I've verified it doesn't touch the filesystem. For both set to on it touches the filesystem around 700 times (still 1 second), I wouldn't recommend that people do that on servers without lots of spare time. It's a real time saver for development though! I hardly ever have to restart apache anymore. Hmm, 700 seems a lot. How are you determining that? Right at the end of your response handler, can you add: from mod_python import importer apache.log_error(modules visited %d + len(importer._get_modules_cache()) What this is doing is accessing a special per request module cache used to avoid problems with a module being changed part way through a request and two bits of code loading different versions. As a consequence, seeing how many modules are held in that, will tell you how many modules come into play with the current request. If you want to know what these modules actually are, you can do something like: for module in importer._get_modules_cache().values(): apache.log_error(module %s % module.__file__) One of things I used to do when I was testing this, was to use some of the dependency information to produce 'dot' graph definition files and view them in GraphViz on the Mac. Quite interesting. I should resurrect the code and post it on the mailing list so people can play with it. I should also go back and reaudit the code again to see how many stats are being done and whether I am doubling up. Graham
HTML formatted traceback
I grew tired of scrolling down through the very long tracbacks to get to the actual error, so I wrote a function that lets me skip everything up to my handler function. While I was at it I figured I may as well highlight the things I want to see to make the tracebacks easier to read. When I was finished I liked it so much I had to share, it would be nice if this found it's way into mod_python. To adapt it for your purposes, change the condition func == 'handler' to something else (you probably don't want to cut out the call to handler like me.) Please excuse my lousy html, I wrote it to be small so it didn't clutter up my source, I hate html in my python code. Feel free to change whatever you like. -Dan import sys, os, re, traceback re_traceback = re.compile(r'File (?Pfile[^]+?), line (?Pline\d+?), in (?Pfunc.+?)\s+(?Pcontext.+?)$', re.DOTALL | re.MULTILINE) def get_formatted_traceback(req): ''' Returns an html formatted traceback for the exception. ''' req.content_type = 'text/html' buf = ['!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN', 'http://www.w3.org/TR/html4/loose.dtd;', 'htmlhead', 'style type=text/css', 'b { color: navy }', 'strong { color: green }', '.context { font-style: italic; margin-left: 30px }', '.trace { margin: 10px }' 'span { font-weight: bold; color: blue }', '/style', '/headbody'] begin_trace = False for e in traceback.format_exception(*sys.exc_info()): m = re_traceback.search(e) if m is not None: path, line, func, context = m.group('file', 'line', 'func', 'context') path, filename = os.path.split(path) if begin_trace: buf.append('divFile %s\b%s/b, line %s, in strong%s/strong\r\ndiv class=context%s/div/div' \ % (path, filename, line, func, context)) else: # Ignore the stuff up to and including the entry point of the application at the call to handler if func == 'handler': begin_trace = True else: if begin_trace: buf.append('/divspan%s/span' % e) # Last Line (the error) else: buf.append('%sdiv class=trace' % e) # First line (Traceback:) buf.append('/body/html') return '\r\n'.join(buf)
Re: HTML formatted traceback
Dan Eloff wrote .. I grew tired of scrolling down through the very long tracbacks to get to the actual error, so I wrote a function that lets me skip everything up to my handler function. While I was at it I figured I may as well highlight the things I want to see to make the tracebacks easier to read. When I was finished I liked it so much I had to share, it would be nice if this found it's way into mod_python. To adapt it for your purposes, change the condition func == 'handler' to something else (you probably don't want to cut out the call to handler like me.) Please excuse my lousy html, I wrote it to be small so it didn't clutter up my source, I hate html in my python code. Feel free to change whatever you like. I'll agree that mod_python 3.3 tracebacks are perhaps longer than they need to be given that I factored out various stuff into some functions so as to be able to reuse it across connection, handler and filter dispatch. I also like the idea of using colour to highlight important stuff. Thus, probably a good time to upgrade the standard traceback page that mod_python generates when there is an error. In others words, incorporate your ideas into the CallBack.ReportError function in the mod_python.apache module. When integrated, will have to be a bit more tricky to deal with fact that could be any phase and handler name for phase can be overridden. The information to do all that should be available though. Anyone else got any other ideas of what to put in the page? For example, perhaps stuff like what req.filename, req.path_info, req.uri etc are set to. Once we get a feel what would be good to see, I'll create a JIRA issue for it and reference the discussion in the mailing list archive. With all my talk of dependency graphs, am thinking I could also put that code in mod_python.testhandler module so as to be optionally used by people as an aid for debugging applications as well. Graham
Re: HTML formatted traceback
I'll agree that mod_python 3.3 tracebacks are perhaps longer than they need to be given that I factored out various stuff into some functions so as to be able to reuse it across connection, handler and filter dispatch. I also like the idea of using colour to highlight important stuff. Thus, probably a good time to upgrade the standard traceback page that mod_python generates when there is an error. In others words, incorporate your ideas into the CallBack.ReportError function in the mod_python.apache module. When integrated, will have to be a bit more tricky to deal with fact that could be any phase and handler name for phase can be overridden. The information to do all that should be available though. I'll get it working for regular handlers, and if the others are non-obvious I'll get help on this mailing list. Anyone else got any other ideas of what to put in the page? For example, perhaps stuff like what req.filename, req.path_info, req.uri etc are set to. That kind of thing should go below the traceback, so as to prevent the problem of having to scroll to see the error from being repeated. I wonder if it would be useful to present a formatted dump of lots of information in the req, server, connection objects. Once we get a feel what would be good to see, I'll create a JIRA issue for it and reference the discussion in the mailing list archive. With all my talk of dependency graphs, am thinking I could also put that code in mod_python.testhandler module so as to be optionally used by people as an aid for debugging applications as well. Yes, good idea. -Dan