Gregory Matthews wrote:
> Hello again.
>
> Is Apache::Leak the easiest/best module to use for both detecting AND
> allowing us to find the source of a memory leak in mod_perl?
>
> If so, I am not finding any good documentation on its use. I am not a
> mod_perl guru and what I've read so far sounds rather involved.
>
> Can someone point me to a location where good, laymen documentation
> exists for this module. I would love to use it to ensure my code is
> solid (I am writing a mod_perl app from scratch and do not want to stray
> off the wrong coding path).

There is not much documentation online regarding this. But as Perrin
has replied in the other thread, you shouldn't worry much about leaks
as long as you don't mess with circular references or autovivified
variables, and you reset your globals.

Here is the relevant section from our book, which should be published
really soon now. It looks like the book is going to be called
"Practical mod_perl". If you notice any wrong or unclear details, or
anything missing, please let me know while we can still correct things.

=head2 Memory Leakage

It's normal for a process to grow when it processes its first few
requests. They may be different requests, or the same requests
processing different data. You may try to reload the same request a
few times and in many cases the process will stop growing after only
the second reload. In any case, once a representative selection of
requests and inputs has been executed by a process, it won't usually
grow any more unless the code leaks memory. If it grows after each
reload of an identical request, then there is probably a memory leak.

The experience might be different if the code works with some external
resource which can change between requests. For example if the code
retrieves database records matching some query, it's possible that
from time to time the database will be updated and that a different
number of records will match the same query the next time it is
issued. Depending on the techniques which you use to retrieve the
data, format it and send it to the user, the process may grow more or
less in size reflecting the changes in the data.

The easiest way to see whether the code is leaking is to run the
server in single process mode (C<httpd -X>), issue the same request a
few times and see whether the process grows after each request. If it
does, you probably have a memory leak. If the code leaks 5kB per
request, after 1000 requests that run the leaking code there will be
5MB of memory leaked.
If in production you have 20 processes then this could possibly lead
to 100MB of leakage after a few tens of thousands of requests.

This technique to detect leakage can be misleading if you are not
careful. Suppose your process first runs some clean (non-leaking)
code which acquires 100kB of memory. In an attempt to make itself
more efficient, Perl doesn't give the 100kB memory back to the
operating system. The next time the process runs I<any> script, some
of the 100kB will be reused. But if this time the process runs a
script that needs to acquire only 5kB, you won't see the process grow
even if the code has actually leaked these 5kB. Now it might take 20
or more requests for the leaking script I<served by the same process>
before you would see that process start growing again.

A process may leak memory for several reasons: badly written system
C/C++ libraries used in the httpd binary and badly written Perl code
are the most common. Perl modules may also use C libraries, and these
might leak memory as well. Some operating systems have been known to
have problems with their memory management functions.

If you know that you have no leaks in your code, for detecting leaks
in C/C++ libraries you should either use the technique of sampling the
memory usage described above, or use C/C++ developer tools designed
for this purpose. This topic is beyond the scope of this book.

The C<Apache::Leak> module (derived from C<Devel::Leak>) might help
you to detect leaks in your code. For example:

  file:leaktest.pl
  ----------------
  use Apache::Leak;

  my $global = "FooA";

  leak_test {
      $$global = 1;
      ++$global;
  };

The argument to C<leak_test()> is an anonymous sub or a block, so you
can just throw in any code you suspect might be leaking. Beware, it
will run the code twice! The first time in, new C<SV>s are created,
but this does not mean the code is leaking. The second pass will give
better evidence. You do not need to be inside mod_perl to use it.
From the command line, the above script outputs:

  ENTER: 1482 SVs
  new c28b8 : new c2918 :
  LEAVE: 1484 SVs
  ENTER: 1484 SVs
  new db690 : new db6a8 :
  LEAVE: 1486 SVs
  !!! 2 SVs leaked !!!

This module uses the simple approach of walking the Perl internal
table of allocated I<Scalar Values> (SVs). It records them before
entering the scope of the code under test and after leaving the scope.
At the end a comparison of the two sets is performed, sv_dump() is
called for any I<things> which did not exist in the first set and the
difference in counts is reported. Notice that you will only see the
dumps of SVs if Perl was built with the C<-DDEBUGGING> option. In our
example it will dump two SVs twice, since the same code is run twice.
The volume of output is too great to be presented here.

Our example leaks because C<$$global = 1;> creates a new global
variable C<FooA> (with the value of C<1>) which will not be destroyed
until this module is destroyed. Under mod_perl the module doesn't get
destroyed until the process quits. When the code is run the second
time, C<$global> will contain I<FooB> because of the increment code at
the end of the first run. Consider:

  $foo = "AAA";
  print "$foo\n";
  $foo++;
  print "$foo\n";

which prints:

  AAA
  AAB

So every time the code is executed a new variable (I<FooC>, I<FooD>
etc.) will spring into existence.

C<Apache::Leak> is not very user-friendly. You may want to take a
look at C<B::LexInfo>. It is possible to see something that might
appear to be a leak, but is actually just a Perl optimization. For
example, consider this code:

  sub test { my ($string) = @_;}
  test("a string");

C<B::LexInfo> will show you that Perl does not release the value from
$string, unless you undef() it. This is because Perl anticipates the
memory will be needed for another string, the next time the subroutine
is entered. You'll see similar behavior for C<@array> length,
C<%hash> keys, and scratch areas of the pad-list for operations such
as C<join()>, `C<.>', etc.
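
The symbolic-reference leak from the C<leak_test> example can also be
observed with nothing but core Perl. The following is only a minimal
sketch, and it is not how C<Apache::Leak> works internally: the helper
names C<leaky()> and C<count_foo_globals()> are made up for
illustration, and the script simply counts the matching entries in the
C<main::> symbol table after each run.

```perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only; neither comes from
# Apache::Leak or B::LexInfo.
my $global = "FooA";

# Same leaking body as in leaktest.pl: the symbolic reference creates a
# package variable named after the current value of $global.
sub leaky {
    no strict 'refs';
    $$global = 1;
    ++$global;    # magic string increment: FooA -> FooB -> FooC ...
    return;
}

# Count the entries in the main:: symbol table that look like the
# names generated by leaky().
sub count_foo_globals {
    return scalar grep { /^Foo[A-Z]$/ } keys %main::;
}

for my $run (1 .. 3) {
    leaky();
    printf "after run %d: %d Foo* globals\n", $run, count_foo_globals();
}
```

Each run adds exactly one new entry (I<FooA>, then I<FooB>, then
I<FooC>), and none of them is ever released. In a long-lived mod_perl
process that growth continues for as long as the process serves
requests.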

Let's look at how C<B::LexInfo> works:

  file:leaktest1.pl
  ----------------
  package LeakTest1;
  use B::LexInfo ();

  sub test { my ($string) = @_;}

  my $lexi = B::LexInfo->new;
  my $diff = $lexi->cvrundiff('LeakTest1::test', "a string");
  print $$diff;

This code creates a new C<B::LexInfo> object, and then runs
cvrundiff() which creates two snapshots of the lexical variables'
padlists--one before LeakTest1::test() is called and the other after
it has been called, in this case with the argument I<"a string">.
Then it calls C<diff -u> to generate the difference between the
snapshots.

In case you aren't familiar with how C<diff> works: C<-> at the
beginning of the line means that that line was removed, C<+> means
that a line was added, other lines are there to show the context in
which the difference was found. Here is the output:

  --- /tmp/B_LexInfo_3099.before  Tue Feb 13 20:09:52 2001
  +++ /tmp/B_LexInfo_3099.after   Tue Feb 13 20:09:52 2001
  @@ -2,9 +2,11 @@
   {
     'LeakTest1::test' => {
       '$string' => {
  -      'TYPE' => 'NULL',
  +      'TYPE' => 'PV',
  +      'LEN' => 9,
         'ADDRESS' => '0x8146d80',
  -      'NULL' => '0x8146d80'
  +      'PV' => 'a string',
  +      'CUR' => 8
       },
       '__SPECIAL__1' => {
         'TYPE' => 'NULL',

Perl tries to optimize the speed by keeping the memory for C<$string>
allocated, even after the variable was destroyed.

If we run the first example with C<B::LexInfo>:

  file:leaktest2.pl
  -----------------
  package LeakTest2;
  use B::LexInfo ();

  my $global = "FooA";

  sub test {
      $$global = 1;
      ++$global;
  }

  my $lexi = B::LexInfo->new;
  my $diff = $lexi->cvrundiff('LeakTest2::test');
  print $$diff;

and the result:

  --- /tmp/B_LexInfo_3103.before  Tue Feb 13 20:12:04 2001
  +++ /tmp/B_LexInfo_3103.after   Tue Feb 13 20:12:04 2001
  @@ -5,7 +5,7 @@
         'TYPE' => 'PV',
         'LEN' => 5,
         'ADDRESS' => '0x80572ec',
  -      'PV' => 'FooA',
  +      'PV' => 'FooB',
         'CUR' => 4
       }
     }

We can clearly see the leakage, since the value of the C<PV> entry has
changed from one string to a different one.
Compare this with the previous example, where a variable didn't exist
and sprang into existence for optimization reasons.

If you are still confused, probably the best approach is to run the
C<diff> twice when you test your code. Here is what happens when we
run cvrundiff() twice in both of our examples:

  file:leaktest3.pl
  -----------------
  package LeakTest2;
  use B::LexInfo ();

  my $global = "FooA";

  sub test {
      $$global = 1;
      ++$global;
  }

  my $lexi = B::LexInfo->new;
  my $diff = $lexi->cvrundiff('LeakTest2::test');
  $diff    = $lexi->cvrundiff('LeakTest2::test');
  print $$diff;

and the output:

  --- /tmp/B_LexInfo_3103.before  Tue Feb 13 20:12:04 2001
  +++ /tmp/B_LexInfo_3103.after   Tue Feb 13 20:12:04 2001
  @@ -5,7 +5,7 @@
         'TYPE' => 'PV',
         'LEN' => 5,
         'ADDRESS' => '0x80572ec',
  -      'PV' => 'FooB',
  +      'PV' => 'FooC',
         'CUR' => 4
       }
     }

We can see the leak again, since the value of C<PV> has changed again:
from I<FooB> to I<FooC>.

And if we look at the second case:

  file:leaktest4.pl
  -----------------
  package LeakTest1;
  use B::LexInfo ();

  sub test { my ($string) = @_;}

  my $lexi = B::LexInfo->new;
  my $diff = $lexi->cvrundiff('LeakTest1::test', "a string");
  $diff    = $lexi->cvrundiff('LeakTest1::test', "a string");
  print $$diff;

no output is produced, since there is no difference between the second
and the third run. All the data structures are allocated during the
first execution, so we are sure that no memory is leaking here.

C<Apache::Status> includes a C<StatusLexInfo> option which can show
you the internals of your code via C<B::LexInfo>. See Chapter
[XREF=debug.pod].

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com