Re: Runaways
Robert Landrum wrote: I have some very large httpd processes (35 MB) running our application software. Every so often, one of the processes will grow infinitely large, consuming all available system resources. After 300 seconds the process dies (as specified in the config file), and the system usually returns to normal. Is there any way to determine what is eating up all the memory? I need to pinpoint this to a particular module. I've tried coredumping during the incident, but gdb has yet to tell me anything useful.

First, BSD::Resource can save you from these. It will enforce hard limits on memory and CPU consumption. Second, you may be able to register a handler for a signal that will generate a stack trace. Look at Devel::StackTrace (I think) for how to do it. - Perrin
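A minimal sketch of the BSD::Resource approach Perrin describes; the 64 MB and 60-second limits are arbitrary values for illustration, and the exact resource constant available (RLIMIT_AS vs. RLIMIT_DATA or RLIMIT_RSS) varies by platform:

    # startup.pl sketch: cap each child's memory and CPU so a runaway
    # fails with ENOMEM or SIGXCPU instead of eating the whole box.
    use BSD::Resource;

    # Hard-limit the address space to 64 MB (soft and hard limits).
    setrlimit(RLIMIT_AS, 64 * 1024 * 1024, 64 * 1024 * 1024)
        or warn "setrlimit(RLIMIT_AS) failed: $!";

    # Hard-limit CPU time to 60 seconds of actual work.
    setrlimit(RLIMIT_CPU, 60, 60)
        or warn "setrlimit(RLIMIT_CPU) failed: $!";

Once a limit is hit, the failing allocation (or the SIGXCPU) at least points at the code that was running, which is more than an external kill -9 will tell you.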
Re: Runaways
On Mon, 5 Feb 2001, Perrin Harkins wrote: First, BSD::Resource can save you from these. It will enforce hard limits on memory and CPU consumption. Second, you may be able to register a handler for a signal that will generate a stack trace. Look at Devel::StackTrace (I think) for how to do it.

Nope, that's not it. I wrote that one and it doesn't talk about that at all. -dave /*== www.urth.org We await the New Sun ==*/
Re: Runaways
Dave Rolsky wrote: On Mon, 5 Feb 2001, Perrin Harkins wrote: First, BSD::Resource can save you from these. It will enforce hard limits on memory and CPU consumption. Second, you may be able to register a handler for a signal that will generate a stack trace. Look at Devel::StackTrace (I think) for how to do it. Nope, that's not it. I wrote that one and it doesn't talk about that at all.

I meant "for how to generate a stacktrace". Using it with a signal handler was demonstrated on this list about two weeks ago, but I can't recall who did it. It was someone trying to track down a segfault. - Perrin
Re: Runaways
On Mon, 5 Feb 2001, Perrin Harkins wrote: Nope, that's not it. I wrote that one and it doesn't talk about that at all. I meant "for how to generate a stacktrace". Using it with a signal handler was demonstrated on this list about two weeks ago, but I can't recall who did it. It was someone trying to track down a segfault.

Oops, yes, you could use Devel::StackTrace or just use Carp::croak. Devel::StackTrace gives you finer-grained access to the stack trace (you can examine it frame by frame) rather than just dumping a string. For the purposes of dumping a trace to a log, either one will work just fine. -dave /*== www.urth.org We await the New Sun ==*/
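A sketch of the frame-by-frame access Dave describes, wired into a signal handler (the choice of SIGUSR2 and the output format are illustrative):

    # Walk the stack one frame at a time instead of dumping a string.
    use Devel::StackTrace;

    $SIG{USR2} = sub {
        my $trace = Devel::StackTrace->new;
        while ( my $frame = $trace->next_frame ) {
            printf STDERR "%s called from %s line %d\n",
                $frame->subroutine, $frame->filename, $frame->line;
        }
    };

Writing to STDERR lands the trace in Apache's error_log, so a kill -USR2 against a bloated child leaves a record of exactly what it was executing, and the child keeps running afterwards.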
Re: Runaways
On Mon, 29 Jan 2001, Robert Landrum wrote: I have yet to solve the runaway problem, but I came up with a way of identifying the URLs that are causing the problems. First, I added the following to a startup.pl script... $SIG{'USR2'} = \&apache_runaway_handler;

Setting that to \&Carp::confess to get a stack trace might be more useful.
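In full, that suggestion amounts to this (a sketch; the signal choice follows the post above):

    use Carp;

    # confess() dies with a full backtrace; as a signal handler it is
    # passed the signal name ("USR2"), which becomes the message.
    $SIG{'USR2'} = \&Carp::confess;

Then kill -USR2 <pid> against the runaway child dumps a stack trace to the error log. Note that confess() dies, so unlike a plain logging handler this also aborts the request being served.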
Re: Runaways
Actually, I've had some bad experiences with the Carp module. I was using Carp for all my errors and warnings within mod_perl on our development server, but when I moved it to our production server (both similarly configured) it caused every request to core dump. I never figured out what the problem was, but removing the use Carp; and the calls to carp and croak stopped the core dumps. Has anyone else had problems with the Carp module and mod_perl? Robert Landrum

At 8:49 AM -0800 1/31/01, Doug MacEachern wrote: On Mon, 29 Jan 2001, Robert Landrum wrote: I have yet to solve the runaway problem, but I came up with a way of identifying the URLs that are causing the problems. First, I added the following to a startup.pl script... $SIG{'USR2'} = \&apache_runaway_handler; Setting that to \&Carp::confess to get a stack trace might be more useful.
Re: Runaways
On Wed, 31 Jan 2001, Robert Landrum wrote: Has anyone else had problems with the Carp module and mod_perl? There were bugs related to Carp in Perl 5.6.0, fixed in 5.6.1-trial1 and -trial2.
Re: Runaways
On Mon, 29 Jan 2001, Robert Landrum wrote: I have some very large httpd processes (35 MB) running our application software. Every so often, one of the processes will grow infinitely large, consuming all available system resources. After 300 seconds the process dies (as specified in the config file), and the system usually returns to normal. Is there any way to determine what is eating up all the memory? I need to pinpoint this to a particular module. I've tried coredumping during the incident, but gdb has yet to tell me anything useful. I was actually playing around with the idea of hacking the perl source so that it will change $0 to whatever the current package name is, but I don't know that this will translate back to mod_perl correctly, as $0 is the name of the configuration file from within mod_perl. Has anyone had to deal with this sort of problem in the past? Robert Landrum

Me too :). mod_perl does not free the memory when httpd runs the cleanup phase. Use the MaxRequestsPerChild directive in httpd.conf; after my investigations it seems to be the only way to build a sane system. There is no 100% reliable mechanism supplied with Apache. mod_status can provide you some info, but... On Solaris 2.5.1, 7, and 8 you can use /usr/proc/bin/pmap to build a memory map of an httpd process.

Vasily Petrushin +7 (095) 2508363 http://www.interfax.ru mailto:[EMAIL PROTECTED]
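For illustration, the relevant httpd.conf fragment (the request count is an arbitrary example, not a recommendation):

    # Recycle each child after a fixed number of requests so that
    # slow leaks cannot accumulate in any one process forever.
    MaxRequestsPerChild 1000

This does not find the leak, but it bounds how much damage any single child can do between restarts.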
Re: Runaways
Yes, I've seen this happen often, maybe once a day on a relatively heavily used site running mod_perl, where a child process goes into a state where it consumes lots of memory and CPU cycles. I did some investigation, but (like you, it sounds) couldn't garner any useful info from gdb traces. I solved (?) this by writing a little perl script to run from cron and watch for and kill these runaways, but it's an admittedly lame solution. I've meant for a while to look into Stas' Apache::Watchdog::RunAway module to handle these more cleanly, but never did get around to doing this. Let us know if you do get to the bottom of this. Steve

On Mon, 29 Jan 2001, Robert Landrum wrote: I have some very large httpd processes (35 MB) running our application software. Every so often, one of the processes will grow infinitely large, consuming all available system resources. After 300 seconds the process dies (as specified in the config file), and the system usually returns to normal. Is there any way to determine what is eating up all the memory? I need to pinpoint this to a particular module. I've tried coredumping during the incident, but gdb has yet to tell me anything useful. I was actually playing around with the idea of hacking the perl source so that it will change $0 to whatever the current package name is, but I don't know that this will translate back to mod_perl correctly, as $0 is the name of the configuration file from within mod_perl. Has anyone had to deal with this sort of problem in the past? Robert Landrum

=-=-=-=-=-=-=-=-=-=- My God! What have I done? -=-=-=-=-=-=-=-=-=-= Steve Reppucci [EMAIL PROTECTED] | Logical Choice Software http://logsoft.com/ |
Re: Runaways
I did the exact same thing... But the kill(-9,$pid) didn't work, even when run as root. Unfortunately, Apache::Watchdog::RunAway is just as lame as our solutions (sorry Stas), in that it relies on an external process that checks the apache scoreboard and kills anything that's been running for "X" amount of time. You could, in theory, just reduce the "Timeout" option in apache to "X" above to achieve the same result and avoid the external process altogether. The problem, I'm afraid, is that I start hemorrhaging memory at a rate of about 4 megs per second, and after 300 seconds I have a process with just over 1200 megs of memory. The machine itself handles this fine, but if the user stops and does whatever it is they're doing again, I end up with two of those 1200 meg processes... which the machine cannot handle. I'm hoping someone else has a more sophisticated solution to tracing runaway processes to their source. If not, I'll have to write some internal stuff to do the job... Robert Landrum

Yes, I've seen this happen often, maybe once a day on a relatively heavily used site running mod_perl, where a child process goes into a state where it consumes lots of memory and CPU cycles. I did some investigation, but (like you, it sounds) couldn't garner any useful info from gdb traces. I solved (?) this by writing a little perl script to run from cron and watch for and kill these runaways, but it's an admittedly lame solution. I've meant for a while to look into Stas' Apache::Watchdog::RunAway module to handle these more cleanly, but never did get around to doing this. Let us know if you do get to the bottom of this. Steve

On Mon, 29 Jan 2001, Robert Landrum wrote: I have some very large httpd processes (35 MB) running our application software. Every so often, one of the processes will grow ... Has anyone had to deal with this sort of problem in the past? Robert Landrum

=-=-=-=-=-=-=-=-=-=- My God! What have I done? -=-=-=-=-=-=-=-=-=-= Steve Reppucci [EMAIL PROTECTED] | Logical Choice Software http://logsoft.com/ |
Re: Runaways
On Mon, 29 Jan 2001, Robert Landrum wrote: I did the exact same thing... But the kill(-9,$pid) didn't work, even when run as root. Unfortunately, Apache::Watchdog::RunAway is just as lame as our solutions (sorry Stas), in that it relies on an external process that checks the apache scoreboard and kills anything that's been running for "X" amount of time.

Yep, we've had a few of these too -- but it seems I can avoid these if I kill the runaways early enough, before they become too brain dead.

You could, in theory, just reduce the "Timeout" option in apache to "X" above to achieve the same result and avoid the external process altogether.

Hmmm, are you sure about that? According to the apache manual, the TimeOut directive currently defines the amount of time Apache will wait for three things:
1. The total amount of time it takes to receive a GET request.
2. The amount of time between receipt of TCP packets on a POST or PUT request.
3. The amount of time between ACKs on transmissions of TCP packets in responses.
I've never known 'Timeout' to affect the amount of time a child process takes to service a request, though...

The problem, I'm afraid, is that I start hemorrhaging memory at a rate of about 4 megs per second, and after 300 seconds I have a process with just over 1200 megs of memory. The machine itself handles this fine, but if the user stops and does whatever it is they're doing again, I end up with two of those 1200 meg processes... which the machine cannot handle. I'm hoping someone else has a more sophisticated solution to tracing runaway processes to their source. If not, I'll have to write some internal stuff to do the job...

Afraid I can't offer anything better than what it sounds like you already have... Steve

=-=-=-=-=-=-=-=-=-=- My God! What have I done? -=-=-=-=-=-=-=-=-=-= Steve Reppucci [EMAIL PROTECTED] | Logical Choice Software http://logsoft.com/ |
Re: Runaways
I have yet to solve the runaway problem, but I came up with a way of identifying the URLs that are causing the problems. First, I added the following to a startup.pl script...

    $SIG{'USR2'} = \&apache_runaway_handler;

    sub apache_runaway_handler {
        # Open the dump file for writing before printing to it.
        open(RUNFILE, ">/tmp/apache_runaway.$$") or exit(1);
        print RUNFILE "%ENV contains:\n";
        for (keys %ENV) {
            print RUNFILE "$_ = $ENV{$_}\n";
        }
        close(RUNFILE);
        exit(1);
    }

Then I used a process monitor (via cron) to check the sizes of the httpd processes and to issue a system("kill -USR2 $pid") whenever that size reached a certain threshold (in this case 55MB). What's dumped are the environment variables, which contain the URL information, so that the problem can (in theory) be reproduced. Robert Landrum
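A sketch of the kind of cron-driven monitor described above (the ps invocation and the 55 MB threshold are illustrative; ps output formats and units vary by platform):

    #!/usr/bin/perl -w
    use strict;

    # Find httpd children whose resident size exceeds the threshold
    # and send them SIGUSR2 so the handler above can dump %ENV.
    my $threshold_kb = 55 * 1024;    # 55 MB, as in the post above

    # Assumes a ps(1) that understands "-o pid,rss,comm" and
    # reports RSS in kilobytes; adjust for your platform.
    for my $line (`ps -e -o pid,rss,comm`) {
        my ($pid, $rss, $comm) = split ' ', $line;
        next unless defined $pid and $pid =~ /^\d+$/;   # skip header
        next unless defined $comm and $comm =~ /httpd/;
        next unless $rss > $threshold_kb;
        kill 'USR2', $pid or warn "could not signal $pid: $!\n";
    }

Run from cron every minute or so, this pairs with the USR2 handler: each over-threshold child writes its %ENV snapshot to /tmp and exits.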