Re: [jira] Created: (MODPYTHON-184) Memory leak apache.table()

2006-08-16 Thread Jim Gallacher
Alexis Marrero wrote:
 Jim,
 
 This are my results for the memory leak search in apache.table().
 
 The table object creates a memory pool by using apr_pool_create_ex() and
 destroys the pool using apr_pool_destroy(). I added a line in
 MpTable_New() before return (PyObject*)t to destroy the pool and ran
 1M iterations and I notice that there was no memory leak.  Therefore the
 apache functions seems to be working fine.

Actually I don't think apr_pool_destroy() in table_dealloc is actually
destroying the pool. I've been poking around in the code and there is
something odd going on here.

I tried registering a cleanup in MpTable_New() using:

apr_pool_cleanup_register(t-pool,
  pool cleanup called,
   cleanup_test,
  apr_pool_cleaunp_null);

The cleanup_test callback just logs the pool cleanup called message to
a file.

apr_pool_destroy() is getting called in table_dealloc, but cleanup_test
never gets called which indicates that the pool is *not* being
destroyed, and hence our memory leak.

I tried your trick of immediately calling apr_pool_destroy in
MpTable_New(), and cleanup_test does get called there.

So, the big question is... why is the pool not being destroyed?

Can anyone offer some insight here?

The attached diff is for trunk if anyone wants to play around with it.

Jim
Index: tableobject.c
===
--- tableobject.c   (revision 431994)
+++ tableobject.c   (working copy)
@@ -59,6 +59,19 @@
 return (PyObject *)result;
 }
 
+
+
+apr_status_t cleanup_test(void *msg)
+{
+FILE *f;
+f = fopen(/tmp/debug_table.log, a+);
+fprintf(f, %s\n, (char *)msg);
+fclose(f);
+
+return 0;
+}
+
+
 /** 
  ** MpTable_New
  **
@@ -78,6 +91,8 @@
 tableobject *t;
 apr_pool_t *p;
 
+cleanup_test(MpTable_New() called);
+
 /* XXX need second arg abort function to report mem error */
 apr_pool_create_ex(p, NULL, NULL, NULL);
 
@@ -86,7 +101,12 @@
 
 /* remember the pointer to our own pool */
 t-pool = p;
+apr_pool_cleanup_register(p,   pool cleanup called, cleanup_test, 
apr_pool_cleanup_null);
 
+/* Uncomment this to test that cleanup_test is getting called correctly.
+apr_pool_destroy(t-pool);
+*/
+
 return (PyObject *)t;
 
 }
@@ -99,10 +119,13 @@
 
 static void table_dealloc(register tableobject *self)
 {  
+cleanup_test(table_dealloc:);
 
 if (MpTable_Check(self)) {
-if (self-pool) 
+if (self-pool) { 
+cleanup_test(  preparing to destroy the pool);
 apr_pool_destroy(self-pool);
+}
 PyObject_Del(self);
 }
 else


Re: [jira] Created: (MODPYTHON-184) Memory leak apache.table()

2006-08-16 Thread Justin Erenkrantz

On 8/16/06, Jim Gallacher [EMAIL PROTECTED] wrote:

Actually I don't think apr_pool_destroy() in table_dealloc is actually
destroying the pool. I've been poking around in the code and there is
something odd going on here.


I would actually love to test this, but I can't build trunk on Mac OS X.

The definitions of PyLFS (LINKFORSHARED) refer to values that I don't have:

LINKFORSHARED= -u __dummy -u _PyMac_Error -framework System
$(PYTHONFRAMEWORKDIR)/Versions/$(VERSION)/$(PYTHONFRAMEWORK)
-framework CoreServices -framework Foundation

That's not the right line.  It should be looking at LDSHARED and
PYTHONFRAMEWORK instead.

Ideally, we should switch to a format like what Subversion uses to
detect the values: which queries Python directly.  See:

http://svn.collab.net/repos/svn/trunk/build/get-py-info.py

The sed magic going on in autoconf is just too creaky.  (Subversion
builds against Python on Mac OS X just fine.)

If we import get-py-info.py to MP, we should prepend it with the
license block from:

http://svn.collab.net/repos/svn/trunk/subversion/LICENSE

(That script doesn't have a license block, but it really should.  I'll
poke other SVN devs about fixing that, but importing that LICENSE
block is a safe compromise in the meantime.)

Thanks.  -- justin


Re: New module importer - why not make it the default in 3.3?

2006-08-16 Thread Dan Eloff

The new importer gets my vote.

I've been using it for a while now in my development servers and it
works great. I've not discovered any bugs. I've verified it with
PythonAutoReload and PythonDebug in any combination of On and Off. For
a complex hierarchy of python files in with both set to off, I've
verified it doesn't touch the filesystem. For both set to on it
touches the filesystem around 700 times (still  1 second), I wouldn't
recommend that people do that on servers without lots of spare time.
It's a real time saver for development though! I hardly ever have to
restart apache anymore.

-Dan

On 8/12/06, Jim Gallacher [EMAIL PROTECTED] wrote:

Graham,

In your JIRA cleanup session you made the comment a number of times that
the new importer is not likely to be the default in 3.3. I'm just
wondering why it can't be the default, with the old importer as an
option? Maybe:

PythonOption mod_python.legacy.importer *

If we were to do this we'd likely want a longer 3.3 beta test period,
but if it takes care of the ongoing module import issues it might be
worth it.

Jim



Building on Mac OS X was [jira] Created: (MODPYTHON-184) Memory leak apache.table()

2006-08-16 Thread Justin Erenkrantz

On 8/16/06, Graham Dumpleton [EMAIL PROTECTED] wrote:

 I would actually love to test this, but I can't build trunk on Mac OS X.

Huh!

Do you have more than one version of Python installed?


Nope.  I just have Python from /usr/bin/python.  Nothing special.


I do all my work on Mac OS X and have no problem. I only have the standard
OSE supplied version of Python installed though.


Not sure what you mean by OSE - do you mean OS?  Anyway, I just rely
on what comes with the OS.

I'm sure it worked at some point, but that's probably years ago.

I have:

Python 2.3.5 (#1, Mar 20 2005, 20:38:20)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin


FWIW, am using Mac OS X 10.4.7 at present.


So am I.

mod_python.so doesn't link as the LINKFORSHARED values don't expand
properly.  It's not even the right value to be bringing in from
Python's Makefile!


Because of other stuff, have been ignoring the table object issues, but maybe
I should start paying more attention. :-)


I'm more than happy to fix up autoconf to stop doing the insanity with
sed and determine those values within Python directly.  I'll try to do
so this weekend: time permitting.  (Greg and I and the other SVN devs
solved this problem already within Subversion.)

However, since I can't build, this means that I can't debug this pool
problem.  =(  -- justin


Re: Building on Mac OS X was [jira] Created: (MODPYTHON-184) Memory leak apache.table()

2006-08-16 Thread Graham Dumpleton
Justin Erenkrantz wrote ..
 On 8/16/06, Graham Dumpleton [EMAIL PROTECTED] wrote:
   I would actually love to test this, but I can't build trunk on Mac
 OS X.
 
  Huh!
 
  Do you have more than one version of Python installed?
 
 Nope.  I just have Python from /usr/bin/python.  Nothing special.

Strange.

  I do all my work on Mac OS X and have no problem. I only have the standard
  OSE supplied version of Python installed though.
 
 Not sure what you mean by OSE - do you mean OS?  Anyway, I just rely
 on what comes with the OS.

I meant OS. My slip up was because I actually do make available a product
called OSE and so have a predilection for typing that. Same sort of thing
when I have to use cvs at one of my jobs, I always keep typing 'svn'. :-)

 mod_python.so doesn't link as the LINKFORSHARED values don't expand
 properly.  It's not even the right value to be bringing in from
 Python's Makefile!
 
  Because of other stuff, have been ignoring the table object issues, but
 maybe
  I should start paying more attention. :-)
 
 I'm more than happy to fix up autoconf to stop doing the insanity with
 sed and determine those values within Python directly.  I'll try to do
 so this weekend: time permitting.  (Greg and I and the other SVN devs
 solved this problem already within Subversion.)

Would what you have in mind address the following in any way:

  https://issues.apache.org/jira/browse/MODPYTHON-20

Or is that a separate unrelated problem?

Graham


Re: New module importer - why not make it the default in 3.3?

2006-08-16 Thread Graham Dumpleton
Dan Eloff wrote ..
 The new importer gets my vote.
 
 I've been using it for a while now in my development servers and it
 works great. I've not discovered any bugs. I've verified it with
 PythonAutoReload and PythonDebug in any combination of On and Off. For
 a complex hierarchy of python files in with both set to off, I've
 verified it doesn't touch the filesystem. For both set to on it
 touches the filesystem around 700 times (still  1 second), I wouldn't
 recommend that people do that on servers without lots of spare time.
 It's a real time saver for development though! I hardly ever have to
 restart apache anymore.

Hmm, 700 seems a lot. How are you determining that?

Right at the end of your response handler, can you add:

  from mod_python import importer
  apache.log_error(modules visited %d + len(importer._get_modules_cache())

What this is doing is accessing a special per request module cache used
to avoid problems with a module being changed part way through a
request and two bits of code loading different versions. As a consequence,
seeing how many modules are held in that, will tell you how many modules
come into play with the current request.

If you want to know what these modules actually are, you can do something
like:

  for module in importer._get_modules_cache().values():
apache.log_error(module %s % module.__file__)

One of things I used to do when I was testing this, was to use some of the
dependency information to produce 'dot' graph definition files and view them
in GraphViz on the Mac. Quite interesting. I should resurrect the code and
post it on the mailing list so people can play with it. I should also go back
and reaudit the code again to see how many stats are being done and
whether I am doubling up.

Graham

  


HTML formatted traceback

2006-08-16 Thread Dan Eloff

I grew tired of scrolling down through the very long tracbacks to get
to the actual error, so I wrote a function that lets me skip
everything up to my handler function. While I was at it I figured I
may as well highlight the things I want to see to make the tracebacks
easier to read. When I was finished I liked it so much I had to share,
it would be nice if this found it's way into mod_python. To adapt it
for your purposes, change the condition func == 'handler' to something
else (you probably don't want to cut out the call to handler like me.)
Please excuse my lousy html, I wrote it to be small so it didn't
clutter up my source, I hate html in my python code. Feel free to
change whatever you like.

-Dan

import sys, os, re, traceback

re_traceback = re.compile(r'File (?Pfile[^]+?), line
(?Pline\d+?), in (?Pfunc.+?)\s+(?Pcontext.+?)$',
 re.DOTALL | re.MULTILINE)

def get_formatted_traceback(req):
  ''' Returns an html formatted traceback for the exception. '''

  req.content_type = 'text/html'

  buf = ['!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN',
 'http://www.w3.org/TR/html4/loose.dtd;',
 'htmlhead',
 'style type=text/css',
 'b { color: navy }',
 'strong { color: green }',
 '.context { font-style: italic; margin-left: 30px }',
 '.trace { margin: 10px }'
 'span { font-weight: bold; color: blue }',
 '/style',
 '/headbody']

  begin_trace = False
  for e in traceback.format_exception(*sys.exc_info()):
 m = re_traceback.search(e)
 if m is not None:
path, line, func, context = m.group('file', 'line', 'func', 'context')
path, filename = os.path.split(path)
if begin_trace:
   buf.append('divFile %s\b%s/b, line %s, in
strong%s/strong\r\ndiv class=context%s/div/div' \
  % (path, filename, line, func, context))
else:
   # Ignore the stuff up to and including the entry point of
the application at the call to handler
   if func == 'handler': begin_trace = True
 else:
if begin_trace: buf.append('/divspan%s/span' % e) #
Last Line (the error)
else: buf.append('%sdiv class=trace' % e) # First line (Traceback:)

  buf.append('/body/html')
  return '\r\n'.join(buf)


Re: HTML formatted traceback

2006-08-16 Thread Graham Dumpleton
Dan Eloff wrote ..
 I grew tired of scrolling down through the very long tracbacks to get
 to the actual error, so I wrote a function that lets me skip
 everything up to my handler function. While I was at it I figured I
 may as well highlight the things I want to see to make the tracebacks
 easier to read. When I was finished I liked it so much I had to share,
 it would be nice if this found it's way into mod_python. To adapt it
 for your purposes, change the condition func == 'handler' to something
 else (you probably don't want to cut out the call to handler like me.)
 Please excuse my lousy html, I wrote it to be small so it didn't
 clutter up my source, I hate html in my python code. Feel free to
 change whatever you like.

I'll agree that mod_python 3.3 tracebacks are perhaps longer than they
need to be given that I factored out various stuff into some functions
so as to be able to reuse it across connection, handler and filter dispatch.
I also like the idea of using colour to highlight important stuff.

Thus, probably a good time to upgrade the standard traceback page
that mod_python generates when there is an error. In others words,
incorporate your ideas into the CallBack.ReportError function in the
mod_python.apache module. When integrated, will have to be a bit
more tricky to deal with fact that could be any phase and handler name
for phase can be overridden. The information to do all that should be
available though.

Anyone else got any other ideas of what to put in the page? For example,
perhaps stuff like what req.filename, req.path_info, req.uri etc are set to.

Once we get a feel what would be good to see, I'll create a JIRA issue for
it and reference the discussion in the mailing list archive.

With all my talk of dependency graphs, am thinking I could also put that
code in mod_python.testhandler module so as to be optionally used by
people as an aid for debugging applications as well.

Graham



Re: HTML formatted traceback

2006-08-16 Thread Dan Eloff

I'll agree that mod_python 3.3 tracebacks are perhaps longer than they
need to be given that I factored out various stuff into some functions
so as to be able to reuse it across connection, handler and filter dispatch.
I also like the idea of using colour to highlight important stuff.

Thus, probably a good time to upgrade the standard traceback page
that mod_python generates when there is an error. In others words,
incorporate your ideas into the CallBack.ReportError function in the
mod_python.apache module. When integrated, will have to be a bit
more tricky to deal with fact that could be any phase and handler name
for phase can be overridden. The information to do all that should be
available though.


I'll get it working for regular handlers, and if the others are
non-obvious I'll get help on this mailing list.


Anyone else got any other ideas of what to put in the page? For example,
perhaps stuff like what req.filename, req.path_info, req.uri etc are set to.


That kind of thing should go below the traceback, so as to prevent the
problem of having to scroll to see the error from being repeated. I
wonder if it would be useful to present a formatted dump of lots of
information in the req, server, connection objects.


Once we get a feel what would be good to see, I'll create a JIRA issue for
it and reference the discussion in the mailing list archive.

With all my talk of dependency graphs, am thinking I could also put that
code in mod_python.testhandler module so as to be optionally used by
people as an aid for debugging applications as well.


Yes, good idea.

-Dan