Re: Memory leak/server crashes

2000-01-26 Thread Doug MacEachern

there are hints in the SUPPORT doc on how to debug such problems.  there
were also several "Hanging process" threads in the past weeks with more
tips; search the archives for the keywords gdb, .gdbinit and curinfo.
if you can get more insight from those tips, we can help more.

On Sun, 9 Jan 2000, James Furness wrote:

> [snip]
> 



Re: Memory leak/server crashes

2000-01-11 Thread James Furness

> > I'm looking for some help getting apache to run reliably. Apache 1.3.9
> > with mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb
> > of RAM running Redhat 6.1. We run about 5 sites off the box, most of
> > which are fairly high traffic, and use a lot of CGI and
> > MySQL 3.22.25 is used with Apache::DBI.
> >
> > The major problem seems to be a memory leak of some sort, identical to
> > that described in the "memory leak in mod_perl" thread on this list from
> > October 1997 and the "httpd, mod_perl and memory consumption (long)"
> > thread from July 1997.
>
> [snip]
>
> I too have had this problem and haven't found a suitable solution.  In
> my case, though, I think the leaks are primarily due to old perl
> scripts being run under Registry and not bugs in mod_perl or perl.

Well as I said, there is a lot of old code, which I guess could be the
culprit.

> The first thing to do is to try to discover if the problem is a
> mod_perl problem or a bad script problem.  If your server can handle
> it, you could try a binary search to find which (if any) scripts make
> the problem worse.  Basically pick half your registry scripts and use
> mod_cgi.  If leaks persist, you know that you have some problem
> scripts in the ones you didn't make mod_cgi.  If leaks stop, then you
> know the problem scripts are in the ones you made mod_cgi.  Repeat as
> necessary until you have narrowed it down to a single script.  This is
> tedious though and may not be practical.

Ok, i'll try this when I get time.

> Now, let's assume the problem is in fact in mod_perl or apache or perl
> itself.  In this case I'm not sure what the best way to proceed is.  I
> think mod_perl and perl have shown themselves to be pretty good about
> not leaking memory, as has apache.  IMO it's much, much more likely a
> problem concerning Registry and impolite scripts that are misbehaving
> and leaving parts of themselves around.

Yeah, i'm willing to believe my scripts could be a cause.

> Have you tried correlating the memory surges with any page accesses?
> That may help narrow down the culprit.

I'm not really sure how I could go about doing that, any suggestions? :)
--
James Furness <[EMAIL PROTECTED]>
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-11 Thread James Furness

> Why reinvent the wheel? I wrote Apache::VMonitor (grab it from CPAN), which
> does all this and more (all but tail -f). I use it all the time; it saves
> me a lot of time, since I don't have to telnet!

Ok - I will try to look into that when I get time.

> > 2)  Open up and hack Apache::SizeLimit and have it do a stack dump
> > (Carp::croak) of what's going on... there may be some clue there.

I've done this (In a PerlRequire'd file):
--
# Use apache process size limitation
use Apache::SizeLimit;

$Apache::SizeLimit::MAX_PROCESS_SIZE = 9000;
$Apache::SizeLimit::CHECK_EVERY_N_REQUESTS = 3;
---

and 'PerlCleanupHandler Apache::SizeLimit' in httpd.conf.

The server is still getting restarted by the uptime/swapfile monitor, so I'm
not sure if this is having an effect.

I'll look into opening up sizelimit and doing a stack dump as soon as I get
time.

> Apache::GTopLimit is an advanced one :) (you are on Linux, right?) but
> Apache::SizeLimit is just fine

I had some problems with GTop. I was trying to use Apache::VMonitor, so I
downloaded and installed libgtop and the other packages needed (I forget
which now) and tried to install VMonitor, but it failed on make test: it
couldn't locate one of the packages I had definitely installed (a graphics
manipulation one, from memory), but I'm writing this e-mail offline so I
can't check :)

> 3) try running in single mode with 'strace' (probably not a good idea for
> a production server), but you can still strace all the processes into a
> log file

Well I might be able to get a server running in single mode on a different
port and try that, it would be worth the information gained if I can sort
this problem out :)

> 4) Apache::Leak ?

Ok, will look at that too.
--
James Furness <[EMAIL PROTECTED]>
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Stas Bekman

On Sun, 9 Jan 2000, Sean Chittenden wrote:

>   Yeah...  two things I'd do:
> 
>   1)  Open two telnet sessions to the box.  One for top that is
> monitoring processes for your web user (www typically) and is sorting by
> memory usage w/ a 1 second refresh.  I'd change the size of the window and
> make it pretty short so that the refreshes happen quicker, but that
> depends on your connection speed.  The second telnet window is a window
> that tails your access log (tail -f).  It sounds boring, but by watching
> the two, you should have an idea as to when the problem happens.

Why reinvent the wheel? I wrote Apache::VMonitor (grab it from CPAN), which
does all this and more (all but tail -f). I use it all the time; it saves
me a lot of time, since I don't have to telnet!

>   2)  Open up and hack Apache::SizeLimit and have it do a stack dump
> (Carp::croak) of what's going on... there may be some clue there.

Apache::GTopLimit is an advanced one :) (you are on Linux, right?) but
Apache::SizeLimit is just fine

3) try running in single mode with 'strace' (probably not a good idea for
a production server), but you can still strace all the processes into a
log file

4) Apache::Leak ?

___
Stas Bekman  mailto:[EMAIL PROTECTED]  http://www.stason.org/stas
Perl,CGI,Apache,Linux,Web,Java,PC  http://www.stason.org/stas/TULARC
perl.apache.org  modperl.sourcegarden.org  perlmonth.com  perl.org
single o-> + single o-+ = singlesheaven  http://www.singlesheaven.com



Re: Memory leak/server crashes

2000-01-09 Thread Sean Chittenden

Yeah...  two things I'd do:

1)  Open two telnet sessions to the box.  One for top that is
monitoring processes for your web user (www typically) and is sorting by
memory usage w/ a 1 second refresh.  I'd change the size of the window and
make it pretty short so that the refreshes happen quicker, but that
depends on your connection speed.  The second telnet window is a window
that tails your access log (tail -f).  It sounds boring, but by watching
the two, you should have an idea as to when the problem happens.
2)  Open up and hack Apache::SizeLimit and have it do a stack dump
(Carp::croak) of what's going on... there may be some clue there.

Solution #1 will probably be your best bet...  Good luck (cool
site too!).  --SC

-- 
Sean Chittenden  <[EMAIL PROTECTED]>
fingerprint = 6988 8952 0030 D640 3138  C82F 0E9A DEF1 8F45 0466

The faster I go, the behinder I get.
-- Lewis Carroll

On Sun, 9 Jan 2000, James Furness wrote:

> Date: Sun, 9 Jan 2000 21:47:03 -
> From: James Furness <[EMAIL PROTECTED]>
> Reply-To: James Furness <[EMAIL PROTECTED]>
> To: Sean Chittenden <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: Memory leak/server crashes
> 
> > Try using Apache::SizeLimit as a way of controlling your
> > processes.  Sounds like a recursive page that performs infinite internal
> > requests.
> 
> Ok, sounds like a good solution, but it still seems to me I should be
> eliminating the problem at the source. Any ideas as to how I could narrow
> down the location of whatever's causing the recursion?
> --
> James Furness <[EMAIL PROTECTED]>
> ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Chip Turner

"James Furness" <[EMAIL PROTECTED]> writes:

> I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
> mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
> running Redhat 6.1. We run about 5 sites off the box, most of which are
> fairly high traffic, and use a lot of CGI and
> MySQL 3.22.25 is used with Apache::DBI.
> 
> The major problem seems to be a memory leak of some sort, identical to that
> described in the "memory leak in mod_perl" thread on this list from October
> 1997 and the "httpd, mod_perl and memory consumption (long)" thread from
> July 1997.

[snip]

I too have had this problem and haven't found a suitable solution.  In
my case, though, I think the leaks are primarily due to old perl
scripts being run under Registry and not bugs in mod_perl or perl.

The first thing to do is to try to discover if the problem is a
mod_perl problem or a bad script problem.  If your server can handle
it, you could try a binary search to find which (if any) scripts make
the problem worse.  Basically pick half your registry scripts and use
mod_cgi.  If leaks persist, you know that you have some problem
scripts in the ones you didn't make mod_cgi.  If leaks stop, then you
know the problem scripts are in the ones you made mod_cgi.  Repeat as
necessary until you have narrowed it down to a single script.  This is
tedious though and may not be practical.
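
The halving procedure above can be sketched in Perl roughly as follows.
This is only an illustration, not a tool that exists anywhere:
find_leaky_script and the callback passed to it are hypothetical names.
The callback stands in for the manual step of leaving a set of scripts
under Apache::Registry (with the rest moved to mod_cgi) and watching
whether the leak still shows up. The sketch also assumes that exactly one
script is at fault.

```perl
use strict;
use warnings;

# Narrow a list of scripts down to a single suspect by repeated halving.
# $leaks_with is a hypothetical callback: given the scripts currently left
# under Apache::Registry (everything else moved to mod_cgi), it returns
# true if the leak is still observed. Assumes exactly one culprit.
sub find_leaky_script {
    my ($leaks_with, @scripts) = @_;
    while (@scripts > 1) {
        my $half   = int(@scripts / 2);
        my @first  = @scripts[0 .. $half - 1];
        my @second = @scripts[$half .. $#scripts];
        # Keep @first under Registry and move @second to mod_cgi.
        # If the leak persists the culprit is in @first, otherwise in @second.
        @scripts = $leaks_with->(@first) ? @first : @second;
    }
    return $scripts[0];
}

# Simulated run: pretend /cgi-bin/search.pl is the leaky script.
my @all = map { "/cgi-bin/$_.pl" } qw(index login search report upload);
my $culprit = find_leaky_script(
    sub { scalar grep { $_ eq '/cgi-bin/search.pl' } @_ },
    @all,
);
print "suspect: $culprit\n";
```

With several leaky scripts the same loop would only find one of them, so
it would have to be repeated after each fix.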

Depending on how old the scripts are, I would check for non-closed
filehandles, excessive global variables, not using strict, etc.
perl-status is your friend (hopefully you have it enabled!) so you can
see the namespaces of each httpd and see if you have any candidate
variables, file handles, functions, etc that could be clogging memory.

As a last resort, you could try Apache::SizeLimit to cap the size of
each httpd daemon.  This works reasonably well for us.  Something to
the effect of:

use Apache::SizeLimit;

$Apache::SizeLimit::MAX_PROCESS_SIZE = 16384; 
$Apache::SizeLimit::CHECK_EVERY_N_REQUESTS = 3;

should help cap your processes at 16meg each.  Tweak as necessary.
Read the perldoc for Apache::SizeLimit for all the info you need.

Now, let's assume the problem is in fact in mod_perl or apache or perl
itself.  In this case I'm not sure what the best way to proceed is.  I
think mod_perl and perl have shown themselves to be pretty good about
not leaking memory, as has apache.  IMO it's much, much more likely a
problem concerning Registry and impolite scripts that are misbehaving
and leaving parts of themselves around.

Have you tried correlating the memory surges with any page accesses?
That may help narrow down the culprit.
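
One way to do that correlation, sketched under some assumptions: suppose
something writes one line per completed request containing the child's
pid, its size in kB and the requested URI. That log format, and the
growth_by_uri function below, are hypothetical and not part of Apache or
mod_perl. Summing the growth between consecutive requests of the same
child then shows which URIs the surges follow.

```perl
use strict;
use warnings;

# Each record is "pid size_kb uri", one per completed request, in time
# order. This log format is hypothetical; something would have to write
# it, e.g. a small hook that records the child's size after each request.
sub growth_by_uri {
    my @lines = @_;
    my (%last_size, %growth);
    for my $line (@lines) {
        my ($pid, $size, $uri) = split ' ', $line, 3;
        next unless defined $uri;
        # Growth is measured against the same child's previous request.
        if (exists $last_size{$pid}) {
            $growth{$uri} += $size - $last_size{$pid};
        }
        $last_size{$pid} = $size;
    }
    return %growth;
}

# Made-up sample data for two children (pids 101 and 102).
my %growth = growth_by_uri(
    '101 6000 /index.html',
    '101 6010 /cgi-bin/search.pl',
    '102 6000 /index.html',
    '101 9500 /cgi-bin/report.pl',
    '102 6004 /cgi-bin/search.pl',
    '101 14000 /cgi-bin/report.pl',
);

# List URIs by total growth, largest first.
for my $uri (sort { $growth{$b} <=> $growth{$a} } keys %growth) {
    printf "%-24s %+d kB\n", $uri, $growth{$uri};
}
```

The first request seen for each child has nothing to compare against, so
it is not attributed to any URI.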

Good luck!

Chip

-- 
Chip Turner   [EMAIL PROTECTED]
  Programmer, ZFx, Inc.  www.zfx.com
  PGP key available at wwwkeys.us.pgp.net



Re: Memory leak/server crashes

2000-01-09 Thread James Furness

> Try using Apache::SizeLimit as a way of controlling your
> processes.  Sounds like a recursive page that performs infinite internal
> requests.

Ok, sounds like a good solution, but it still seems to me I should be
eliminating the problem at the source. Any ideas as to how I could narrow
down the location of whatever's causing the recursion?
--
James Furness <[EMAIL PROTECTED]>
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Sean Chittenden

Try using Apache::SizeLimit as a way of controlling your
processes.  Sounds like a recursive page that performs infinite internal
requests.

-- 
Sean Chittenden  <[EMAIL PROTECTED]>
fingerprint = 6988 8952 0030 D640 3138  C82F 0E9A DEF1 8F45 0466

My mother once said to me, "Elwood," (she always called me Elwood)
"Elwood, in this world you must be oh so smart or oh so pleasant."
For years I tried smart.  I recommend pleasant.
-- Elwood P. Dowde, "Harvey"

On Sun, 9 Jan 2000, James Furness wrote:

> Date: Sun, 9 Jan 2000 19:58:00 -
> From: James Furness <[EMAIL PROTECTED]>
> Reply-To: James Furness <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Memory leak/server crashes
> 
> [snip]

Memory leak/server crashes

2000-01-09 Thread James Furness

I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
running Redhat 6.1. We run about 5 sites off the box, most of which are
fairly high traffic, and use a lot of CGI and
MySQL 3.22.25 is used with Apache::DBI.

The major problem seems to be a memory leak of some sort, identical to that
described in the "memory leak in mod_perl" thread on this list from October
1997 and the "httpd, mod_perl and memory consumption (long)" thread from
July 1997.

The server runs normally for several hours, then suddenly a httpd process
starts growing exponentially, the swapfile usage grows massively and the
server starts to become sluggish (I assume due to disk thrashing caused by
the heavy swap usage). Usually when this started to happen I would log in
and use apachectl stop to shut down the server, then type 'killall httpd'
several times till the processes finally died off, and then use apachectl
start to restart apache. If I was not around or did not catch this, the
server would eventually become unresponsive and lock up, requiring a manual
reboot by the datacentre staff. Messages such as "Out of memory" and
"Callback called exit" would appear in the error log as the server spiralled
down and MySQL would start to have trouble running.

To combat this, I created a script to monitor load and swapfile usage, and
restart apache as described above if load was above 7 and swapfile usage
above 150Mb. This script has kept the server online and we now have an
uptime of something like 22 days (previously no more than 1 day). No more
"Out of memory" messages are appearing, but the script is getting triggered
several times a day, so the situation is not ideal.
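
For reference, the check such a monitor performs might look roughly like
the sketch below. It is not the actual script: should_restart and slurp
are hypothetical names, and it assumes a Linux /proc/meminfo that reports
"SwapTotal:" and "SwapFree:" lines in kB. Only the thresholds (load above
7 and swap usage above 150Mb) come from the description above.

```perl
use strict;
use warnings;

# Thresholds taken from the description above.
my $MAX_LOAD    = 7;
my $MAX_SWAP_MB = 150;

# Decide whether a restart is due, given the text of /proc/loadavg and
# /proc/meminfo. Assumes the Linux "SwapTotal:"/"SwapFree:" lines in kB.
# Returns 0 if either input cannot be parsed.
sub should_restart {
    my ($loadavg, $meminfo) = @_;
    my ($load1) = $loadavg =~ /^(\d+(?:\.\d+)?)/          or return 0;
    my ($total) = $meminfo =~ /^SwapTotal:\s+(\d+)\s+kB/m or return 0;
    my ($free)  = $meminfo =~ /^SwapFree:\s+(\d+)\s+kB/m  or return 0;
    my $used_mb = ($total - $free) / 1024;
    return ($load1 > $MAX_LOAD && $used_mb > $MAX_SWAP_MB) ? 1 : 0;
}

# Read a whole file; returns an empty string if it cannot be opened.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or return '';
    local $/;
    my $text = <$fh>;
    close $fh;
    return defined $text ? $text : '';
}

if (should_restart(slurp('/proc/loadavg'), slurp('/proc/meminfo'))) {
    # The restart itself (apachectl stop, wait, apachectl start) is left out.
    print "restart needed\n";
}
```

Keeping the decision in a function that takes plain text makes it easy to
try different thresholds against saved samples before running it from cron.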

I have tried adding:

sub UNIVERSAL::AUTOLOAD {
    my $class = shift;
    Carp::cluck "$class can't \$UNIVERSAL::AUTOLOAD!\n";
}


This was recommended by the developer's guide, but it flooded the error log,
with the text below being printed roughly once a second:

-
Apache=SCALAR(0x830937c) can't $UNIVERSAL::AUTOLOAD!
Apache=SCALAR(0x8309364) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
--
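
A note on reading that output: because the `$` is escaped in the handler
as posted, the log shows the literal text $UNIVERSAL::AUTOLOAD rather than
the name of the method that was not found. A variant that reports the
name might look like the sketch below. Two things in it are assumptions:
@missing is a hypothetical variable kept only for illustration, and
skipping DESTROY is a guess about the source of the noise. Perl does fall
back to AUTOLOAD when an object whose class defines no DESTROY method is
destroyed, which may account for many of these lines.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical list of the calls seen, kept only for illustration.
our @missing;

sub UNIVERSAL::AUTOLOAD {
    my $invocant = shift;
    my $method   = $UNIVERSAL::AUTOLOAD;   # fully qualified name of the call
    return unless defined $method;
    # Objects without a DESTROY method end up here when they are destroyed;
    # ignore those so that only genuinely missing methods are reported.
    return if $method =~ /::DESTROY$/;
    push @missing, $method;
    Carp::cluck("$invocant can't $method");
}

{
    my $obj = bless {}, 'Demo';
    $obj->no_such_method;   # reported as Demo::no_such_method, with a trace
}                           # $obj is destroyed here; that call is ignored
```

If the log goes quiet with this variant, the earlier flood was probably
object destruction rather than missing methods.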

I've pretty much exhausted the ways I can think of to trace this problem:
I've tried to eliminate memory leaks in code by removing some scripts from
mod_perl and running them under mod_cgi, and I've tried tweaking
MaxRequestsPerChild, both without any success.

One thing that was mentioned in a previous thread was that using 'exit'
could confuse perl, and exit() is used fairly heavily in the scripts since
most are converted to mod_perl from standard CGIs, but I'd prefer not to
have to remove these since the structure of the scripts is reliant on some
form of exit statement. Is there some alternative to exit()?
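
One restructuring that avoids exit() without changing the flow much is to
wrap the body of each script in a sub and use an early return wherever
the CGI version called exit. The sketch below is plain Perl and only
illustrates the shape; handle_request and its id parameter are
hypothetical names. mod_perl also provides Apache::exit(), and its
documentation is the place to check how that behaves under
Apache::Registry.

```perl
use strict;
use warnings;

# Body of a converted CGI script, wrapped in a sub so that an early
# "return" can take the place of exit(). handle_request and its argument
# are hypothetical names used only for this sketch.
sub handle_request {
    my (%param) = @_;
    my $out = "Content-type: text/plain\n\n";

    unless (defined $param{id}) {
        $out .= "missing id\n";
        return $out;             # was: print ...; exit;
    }

    $out .= "record $param{id}\n";
    return $out;
}

print handle_request(id => 42);
```

Building the output in a variable is optional; printing directly and
returning early works the same way.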

I've also had a look at some of the patches to Apache.pm and Apache.xs
suggested in the previous threads, and these seem to have been incorporated
into mod_perl 1.21.

Are there any other solutions I could try to this problem? Does anyone know
what might be causing this?

The second problem I have is that when loading pages, usually CGI (though I
think this has happened on some static pages), what IE5 describes as
"Server not found or DNS error" is experienced. Originally I thought this
was the server hitting MaxClients (150), since it usually occurs at the
same time as massive surges of hits and /server-status usually shows that
150 httpd processes have been spawned. However, I recently increased
MaxClients to 200 and the error has continued to happen, even though
/server-status doesn't show any more than about 170 processes spawned. I
have not ruled out DNS server troubles or backbone problems (we've had a
few routing troubles recently that slowed things down, but did not actually
cut off traffic or anything like that), but I am at a loss as to what else
could be causing this, so I thought I'd ask whilst I'm on the subject of
server problems :)

Thanks in advance,
--
James Furness <[EMAIL PROTECTED]>
ICQ #:  4663650