Re: Runaways

2001-02-05 Thread Perrin Harkins

Robert Landrum wrote:
 
 I have some very large httpd processes (35 MB) running our
 application software.  Every so often, one of the processes will grow
 infinitely large, consuming all available system resources.  After 300
 seconds the process dies (as specified in the config file), and the
 system usually returns to normal.  Is there any way to determine what
 is eating up all the memory?  I need to pinpoint this to a particular
 module.  I've tried coredumping during the incident, but gdb has yet
 to tell me anything useful.

First, BSD::Resource can save you from these.  It will do hard limits on
 memory and CPU consumption.  Second, you may be able to register a
handler for a signal that will generate a stack trace.  Look at
Devel::StackTrace (I think) for how to do it.
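
For the first suggestion, here's a minimal, untested sketch of what could
go in startup.pl (the limit values are only examples, not recommendations):

  use BSD::Resource;

  # Hard limits: ~64 MB of data segment and 300 CPU seconds per child.
  # Allocations beyond the memory limit simply fail, and exceeding the
  # CPU limit earns the process a SIGXCPU.
  setrlimit(RLIMIT_DATA, 64 * 1024 * 1024, 64 * 1024 * 1024)
      or warn "setrlimit RLIMIT_DATA failed: $!";
  setrlimit(RLIMIT_CPU, 300, 300)
      or warn "setrlimit RLIMIT_CPU failed: $!";

(If I remember right, Apache::Resource wraps BSD::Resource for exactly this
kind of per-child limit, if you'd rather configure it from httpd.conf.)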
- Perrin



Re: Runaways

2001-02-05 Thread Dave Rolsky

On Mon, 5 Feb 2001, Perrin Harkins wrote:

 First, BSD::Resource can save you from these.  It will do hard limits on
 memory and CPU consumption.  Second, you may be able to register a
 handler for a signal that will generate a stack trace.  Look at
 Devel::StackTrace (I think) for how to do it.

Nope, that's not it.  I wrote that one and it doesn't talk about that at
all.

-dave

/*==
www.urth.org
We await the New Sun
==*/




Re: Runaways

2001-02-05 Thread Perrin Harkins

Dave Rolsky wrote:
 
 On Mon, 5 Feb 2001, Perrin Harkins wrote:
 
  First, BSD::Resource can save you from these.  It will do hard limits on
  memory and CPU consumption.  Second, you may be able to register a
  handler for a signal that will generate a stack trace.  Look at
  Devel::StackTrace (I think) for how to do it.
 
 Nope, that's not it.  I wrote that one and it doesn't talk about that at
 all.

I meant "for how to generate a stacktrace".  Using it with a singal
handler was demonstrated on this list about two weeks ago, but I can't
recall who did it.  It was someone trying to track down a segfault.
- Perrin



Re: Runaways

2001-02-05 Thread Dave Rolsky

On Mon, 5 Feb 2001, Perrin Harkins wrote:

  Nope, that's not it.  I wrote that one and it doesn't talk about that at
  all.

 I meant "for how to generate a stacktrace".  Using it with a singal
 handler was demonstrated on this list about two weeks ago, but I can't
 recall who did it.  It was someone trying to track down a segfault.

Oops, yes, you could use Devel::StackTrace or just use Carp::confess.

Devel::StackTrace gives you finer-grained access to the stack trace (you
can examine it frame by frame) rather than just dumping a string.  For the
purposes of dumping a trace to a log, either one will work just fine.
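
An untested sketch of a handler that walks the frames (the signal choice
and output format are just illustrative):

  use Devel::StackTrace;

  $SIG{USR2} = sub {
      my $trace = Devel::StackTrace->new;
      # Examine the trace frame by frame rather than as one big string.
      while ( my $frame = $trace->next_frame ) {
          print STDERR "  ", ($frame->subroutine || 'main'), " at ",
              $frame->filename, " line ", $frame->line, "\n";
      }
  };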


-dave

/*==
www.urth.org
We await the New Sun
==*/




Re: Runaways

2001-01-31 Thread Doug MacEachern

On Mon, 29 Jan 2001, Robert Landrum wrote:

 I have yet to solve the runaway problem, but I came up with a way of 
 identifying the URLs that are causing the problems.
 
 First, I added the following to a startup.pl script...
 
 $SIG{'USR2'} = \&apache_runaway_handler;

setting that to \&Carp::confess to get a stacktrace might be more useful.
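
i.e. something like this in startup.pl (untested sketch):

  use Carp ();
  $SIG{'USR2'} = \&Carp::confess;

confess() dies with a full backtrace, so the trace should end up in the
error log when the child gets the signal.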




Re: Runaways

2001-01-31 Thread Robert Landrum

Actually, I've had some bad experiences with the Carp module.  I was 
using Carp for all my errors and warnings within mod_perl on our 
development server, but when I moved it to our production server 
(both similarly configured) it caused every request to core dump.  I 
never figured out what the problem was, but removing the use Carp; 
and the calls to carp and croak stopped the core dumps.

Has anyone else had problems with the Carp module and mod_perl?

Robert Landrum

At 8:49 AM -0800 1/31/01, Doug MacEachern wrote:
On Mon, 29 Jan 2001, Robert Landrum wrote:

 I have yet to solve the runaway problem, but I came up with a way of
 identifying the URLs that are causing the problems.

 First, I added the following to a startup.pl script...

 $SIG{'USR2'} = \&apache_runaway_handler;

setting that to \&Carp::confess to get a stacktrace might be more useful.




Re: Runaways

2001-01-31 Thread Doug MacEachern

On Wed, 31 Jan 2001, Robert Landrum wrote:
 
 Has anyone else had problems with the Carp module and mod_perl?

there were bugs related to Carp in 5.6.0, fixed in 5.6.1-trial1,2




Re: Runaways

2001-01-30 Thread Vasily Petrushin

On Mon, 29 Jan 2001, Robert Landrum wrote:

 I have some very large httpd processes (35 MB) running our 

mod_perl is not freeing memory when httpd does its cleanup phase.


Me too :). 

Use the MaxRequestsPerChild directive in httpd.conf.
After my investigation, it seems to be the only way to 
build a stable system. 
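
For example (the number is only illustrative):

  # httpd.conf: recycle each child after it has served this many
  # requests, so slow leaks can't accumulate forever.
  MaxRequestsPerChild 500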

There is no 100% reliable mechanism supplied with Apache.
mod_status can provide you with some info, but...

On Solaris 2.5.1, 7, 8 you can use /usr/proc/bin/pmap to 
build a map of the httpd process.


 application software.  Every so often, one of the processes will grow 
 infinitely large, consuming all available system resources.  After 300 
 seconds the process dies (as specified in the config file), and the 
 system usually returns to normal.  Is there any way to determine what 
 is eating up all the memory?  I need to pinpoint this to a particular 
 module.  I've tried coredumping during the incident, but gdb has yet 
 to tell me anything useful.
 
 I was actually playing around with the idea of hacking the perl 
 source so that it will change $0 to whatever the current package 
 name, but I don't know that this will translate back to mod_perl 
 correctly, as $0 is the name of the configuration file from within 
 mod_perl.
 
 Has anyone had to deal with this sort of problem in the past?
 
 Robert Landrum
 

Vasily Petrushin
+7 (095) 2508363
http://www.interfax.ru
mailto:[EMAIL PROTECTED]




Re: Runaways

2001-01-29 Thread Steve Reppucci


Yes, I've seen this happen often, maybe once a day on a relatively heavily
used site running mod_perl, where a child process goes into a state where
it consumes lots of memory and cpu cycles.  I did some investigation, but
(like you, it sounds) couldn't garner any useful info from gdb traces.

I solved (?) this by writing a little perl script to run from cron
and watch for and kill these runaways, but it's an admittedly lame
solution.  I've meant for a while to look into Stas'
Apache::Watchdog::RunAway module to handle these more cleanly, but never
did get around to doing this.

Let us know if you do get to the bottom of this.

Steve

On Mon, 29 Jan 2001, Robert Landrum wrote:

 I have some very large httpd processes (35 MB) running our 
 application software.  Every so often, one of the processes will grow 
 infinitely large, consuming all available system resources.  After 300 
 seconds the process dies (as specified in the config file), and the 
 system usually returns to normal.  Is there any way to determine what 
 is eating up all the memory?  I need to pinpoint this to a particular 
 module.  I've tried coredumping during the incident, but gdb has yet 
 to tell me anything useful.
 
 I was actually playing around with the idea of hacking the perl 
 source so that it will change $0 to whatever the current package 
 name, but I don't know that this will translate back to mod_perl 
 correctly, as $0 is the name of the configuration file from within 
 mod_perl.
 
 Has anyone had to deal with this sort of problem in the past?
 
 Robert Landrum
 

=-=-=-=-=-=-=-=-=-=-  My God!  What have I done?  -=-=-=-=-=-=-=-=-=-=
Steve Reppucci   [EMAIL PROTECTED] |
Logical Choice Software  http://logsoft.com/ |




Re: Runaways

2001-01-29 Thread Robert Landrum

I did the exact same thing... But the kill(-9,$pid) didn't work, even 
when run as root.  Unfortunately, Apache::Watchdog::RunAway is just as 
lame as our solutions (Sorry Stas), in that it relies on an external 
process that checks the apache scoreboard and kills anything that's 
been running for "X" amount of time.

You could, in theory, just reduce the "Timeout" option in Apache to 
"X" above to achieve the same result, and avoid the external process 
altogether.

The problem, I'm afraid, is that I start hemorrhaging memory at a 
rate of about 4 megs per second, and after 300 seconds, I have a process 
with just over 1200 megs of memory.  The machine itself handles this 
fine, but if the user stops and does whatever it is they're doing 
again, I end up with two of those 1200 meg processes... which the 
machine cannot handle.

I'm hoping someone else has a more sophisticated solution to tracing 
runaway processes to their source.  If not, I'll have to write some 
internal stuff to do the job...

Robert Landrum

Yes, I've seen this happen often, maybe once a day on a relatively heavily
used site running mod_perl, where a child process goes into a state where
it consumes lots of memory and cpu cycles.  I did some investigation, but
(like you, it sounds) couldn't garner any useful info from gdb traces.

I solved (?) this by writing a little perl script to run from cron
and watch for and kill these runaways, but it's an admittedly lame
solution.  I've meant for a while to look into Stas'
Apache::Watchdog::RunAway module to handle these more cleanly, but never
did get around to doing this.

Let us know if you do get to the bottom of this.

Steve

On Mon, 29 Jan 2001, Robert Landrum wrote:

 I have some very large httpd processes (35 MB) running our
  application software.  Every so often, one of the processes will grow

...

  Has anyone had to deal with this sort of problem in the past?

 Robert Landrum


=-=-=-=-=-=-=-=-=-=-  My God!  What have I done?  -=-=-=-=-=-=-=-=-=-=
Steve Reppucci   [EMAIL PROTECTED] |
Logical Choice Software  http://logsoft.com/ |




Re: Runaways

2001-01-29 Thread Steve Reppucci

On Mon, 29 Jan 2001, Robert Landrum wrote:

 I did the exact same thing... But the kill(-9,$pid) didn't work, even 
 when run as root.  Unfortunately, Apache::Watchdog::RunAway is just as 
 lame as our solutions (Sorry Stas), in that it relies on an external 
 process that checks the apache scoreboard and kills anything that's 
 been running for "X" amount of time.

Yep, we've had a few of these too -- but it seems I can avoid these if I
kill the runaways early enough before they become too brain dead.

 You could, in theory, just reduce the "Timeout" option in Apache to 
 "X" above to achieve the same result, and avoid the external process 
 altogether.

Hmmm, are you sure about that?  According to the apache manual:

   The TimeOut directive currently defines the amount of time Apache
   will wait for three things: 

   1. The total amount of time it takes to receive a GET request.
   2. The amount of time between receipt of TCP packets on a POST
      or PUT request.
   3. The amount of time between ACKs on transmissions of TCP packets
      in responses.

I've never known 'Timeout' to affect the amount of time a child process
takes to service a request though...

 The problem, I'm afraid, is that I start hemorrhaging memory at a 
 rate of about 4 megs per second, and after 300 seconds, I have a process 
 with just over 1200 megs of memory.  The machine itself handles this 
 fine, but if the user stops and does whatever it is they're doing 
 again, I end up with two of those 1200 meg processes... which the 
 machine cannot handle.
 
 I'm hoping someone else has a more sophisticated solution to tracing 
 runaway processes to their source.  If not, I'll have to write some 
 internal stuff to do the job...

Afraid I can't offer anything better than what it sounds like you already
have...

Steve

=-=-=-=-=-=-=-=-=-=-  My God!  What have I done?  -=-=-=-=-=-=-=-=-=-=
Steve Reppucci   [EMAIL PROTECTED] |
Logical Choice Software  http://logsoft.com/ |




Re: Runaways

2001-01-29 Thread Robert Landrum

I have yet to solve the runaway problem, but I came up with a way of 
identifying the URLs that are causing the problems.

First, I added the following to a startup.pl script...

$SIG{'USR2'} = \&apache_runaway_handler;
sub apache_runaway_handler {
    # Open the dump file for writing before printing to it.
    open(RUNFILE, ">/tmp/apache_runaway.$$");
    print RUNFILE "\%ENV contains:\n";
    for (keys %ENV) {
        print RUNFILE "$_ = $ENV{$_}\n";
    }
    close(RUNFILE);
    exit(1);
}

Then I used a process monitor (via cron) to check the sizes of the 
httpd processes and to issue a system("kill -USR2 $pid") whenever 
that size reached a certain threshold (in this case 55MB).
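
Roughly, the monitor looks like this (a sketch, not the exact script; the
ps options may need adjusting for your platform):

  #!/usr/bin/perl -w
  use strict;

  # Send USR2 to any httpd whose resident size exceeds ~55 MB.
  my $limit_kb = 55 * 1024;
  for my $line (`ps -eo pid,rss,comm`) {
      my ($pid, $rss, $comm) = split ' ', $line, 3;
      next unless defined $comm and $comm =~ /httpd/;
      kill 'USR2', $pid if $rss =~ /^\d+$/ and $rss > $limit_kb;
  }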

What's dumped are the environment variables which contain the URL 
information so that the problem can (in theory) be reproduced.

Robert Landrum