Pavel

You're welcome. You are correct about the limitations of Apache2::SizeLimit. Processes cannot be 'scrubbed'; rather they should be killed and restarted.

Rapid memory growth should be prevented by prohibiting processes from ever growing larger than a preset limit. On Unix systems, the setrlimit system call sets process resource limits, and those limits are inherited by the process's children. They can also be viewed and set with the bash builtin ulimit.

Many resources can be limited, but I'm focusing on process size, which is controlled by the resource RLIMIT_AS, the maximum size of a process's virtual memory (address space) in bytes. (Some operating systems control RLIMIT_DATA, the maximum size of the process's data segment, but Linux doesn't.)

When a process tries to exceed a resource limit, the system call that requested the resource fails and returns an error. The type of error depends on which resource's limit is violated (see the setrlimit man page). In the case of virtual memory, RLIMIT_AS can be exceeded by any call that asks for additional virtual memory, such as brk(2), which sets the end of the data segment. Perl manages memory via either the system's malloc or its own malloc. If a request for virtual memory fails, then malloc fails, which typically causes the Perl process to write "Out of Memory!" to STDERR and die.

RLIMIT_AS can be set in many ways. One direct way an Apache/mod_perl process can set it is via Apache2::Resource. For example, these directives can be added to httpd.conf:

PerlModule Apache2::Resource
# set child memory limit to 100 megabytes
# RLIMIT_AS (address space) will work to limit the size of a process
PerlSetEnv PERL_RLIMIT_AS 100
PerlChildInitHandler Apache2::Resource

The PerlSetEnv line sets the Perl environment variable PERL_RLIMIT_AS. The PerlChildInitHandler line directs Apache to run Apache2::Resource each time it creates an httpd child process. Apache2::Resource then reads PERL_RLIMIT_AS and sets the RLIMIT_AS limit to 100 (megabytes). Any httpd process that tries to grow beyond 100 MB will have its memory requests fail. (PERL_RLIMIT_AS can also be set to soft_limit:hard_limit, where soft_limit is the limit at which a resource request will fail; the soft limit can be raised at any time, up to the hard limit.)

I recommend against setting this limit for a threaded MPM, because if one request handler drives the process over the limit and gets it killed, then all threads handling requests in that process will fail. And once the process has failed it is difficult to output an error message to the web user, because Perl calls die and the process exits.
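If you want to confirm from inside a child that the limit actually took effect, a minimal sketch like this works, assuming BSD::Resource is installed (Apache2::Resource uses it under the hood); logging to STDERR is just an example:

use strict;
use warnings;
use BSD::Resource;

# getrlimit returns (soft, hard) for the resource; RLIMIT_AS is in bytes,
# and RLIM_INFINITY means no limit is set.
my ($soft, $hard) = getrlimit(RLIMIT_AS);
printf STDERR "RLIMIT_AS soft=%s hard=%s\n", $soft, $hard;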

As I wrote yesterday, failure of a mod_perl process with "Out of Memory!", as occurs when the soft limit of RLIMIT_AS is exceeded, does not trigger an Apache ErrorDocument 500. Nor does a mod_perl process that exits (strictly, one in which CORE::exit() is called) trigger an ErrorDocument 500.

Second, if Apache detects a server error it can redirect to a script, as discussed in the Apache Custom Error Response documentation. That script can access the REDIRECT_* environment variables, but it doesn't know anything else about the original HTTP request.
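For example, a bare-bones ErrorDocument setup might look like this; the URL and script name are placeholders, not anything from your setup:

# httpd.conf
ErrorDocument 500 /errors/oops.pl

# /errors/oops.pl
#!/usr/bin/perl
use strict;
use warnings;

# Apache exposes the original request only through REDIRECT_* variables.
my $status = $ENV{REDIRECT_STATUS} || 'unknown';
my $url    = $ENV{REDIRECT_URL}    || 'unknown';

print "Content-type: text/plain\n\n";
print "The server hit an error ($status) while handling $url.\n";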

At this point I think the best thing to do is use MaxRequestsPerChild and Apache2::SizeLimit to handle most memory problems, and simply let processes that blow up die without feedback to the user. Not ideal, but such failures should be extremely rare.
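For the first part, something like this in httpd.conf (the number is only illustrative):

# prefork MPM: recycle each child after it has served this many requests,
# so slow leaks are bounded even if SizeLimit never fires
MaxRequestsPerChild 10000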

BR
A

On Mar 16, 2010, at 2:31 PM, Pavel Georgiev wrote:

Thank you both for the quick replies!

Arthur,

Apache2::SizeLimit is no solution for my problem, as I'm looking for a way to limit the size each request takes. The fact that I can scrub the process after the request is done (or drop requests if the process reaches some limit, although my understanding is that Apache2::SizeLimit does its job only after the request is done) does not help me.

William,
Let me make sure I'm understanding this right - I'm not using any buffers myself; all I do is sysread() from a unix socket and print(), it's just that I need to print a large amount of data for each request. Are you saying that there is no way to free the memory after I've done print() and rflush()?

BTW thanks for the other suggestions, switching to CGI seems like the only reasonable thing for me; I just want to make sure that this is how mod_perl operates and that it's not me doing something wrong.

Thanks,
Pavel

On Mar 16, 2010, at 11:18 AM, ARTHUR GOLDBERG wrote:

You could use Apache2::SizeLimit ("because size does matter"), which evaluates the size of Apache httpd processes when they complete HTTP requests and kills those that have grown too large. (Note that Apache2::SizeLimit can only be used with non-threaded MPMs, such as prefork.) Since it operates at the end of a request, SizeLimit has the advantage that it doesn't interrupt request processing, and the disadvantage that it won't prevent a process from becoming oversized while it is handling a request. To reduce the per-request overhead of Apache2::SizeLimit, it can be configured to check the size only intermittently by setting the parameter CHECK_EVERY_N_REQUESTS. These parameters can be configured in a <Perl> section in httpd.conf, or in a Perl start-up file.
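Roughly, the configuration looks like this, if I have the current API right (the size argument is in kilobytes and the numbers are only illustrative):

<Perl>
use Apache2::SizeLimit;
# kill a child whose total process size exceeds ~100 MB (value in KB)
Apache2::SizeLimit->set_max_process_size(100_000);
# check the size only every 5th request (CHECK_EVERY_N_REQUESTS) to cut overhead
Apache2::SizeLimit->set_check_interval(5);
</Perl>
PerlCleanupHandler Apache2::SizeLimit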

That way, if your script allocates too much memory, the process will be killed when it finishes handling the request. The MPM will eventually start another process if necessary.

BR
A

On Mar 16, 2010, at 9:30 AM, William T wrote:

On Mon, Mar 15, 2010 at 11:26 PM, Pavel Georgiev <pa...@3tera.com> wrote:
I have a perl script running in mod_perl that needs to write a large amount of data to the client, possibly over a long period. The behavior that I observe is that once I print and flush something, the buffer memory is not reclaimed even though I rflush (I know this can't be given back to the OS).

Is that how mod_perl operates, and is there a way I can force it to periodically free the buffer memory, so that I can use it for new buffers instead of taking more from the OS?

That is how Perl operates.  Mod_Perl is just Perl embedded in the
Apache Process.

You have a few options:
* Buy more memory. :)
* Delegate resource-intensive work to a different process (I would NOT suggest forking a child in Apache).
* Tie the buffer to a file on disk, or a DB object, that can be explicitly reclaimed.
* Create a buffer object of a fixed size and loop (see the sketch after this list).
* Use compression on the data stream that you read into a buffer.
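
For the fixed-size buffer idea, a rough sketch in a mod_perl handler, assuming the socket is already open in $sock and $r is the request object (names are placeholders):

use strict;
use warnings;

my $buf   = '';
my $chunk = 64 * 1024;              # read in 64 KB pieces; tune as needed

# sysread() overwrites $buf on each pass, so Perl keeps reusing the same
# string buffer instead of holding a copy of everything already sent.
while (my $n = sysread($sock, $buf, $chunk)) {
    $r->print($buf);
    $r->rflush;                     # push each chunk out to the client
}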

You could also architect your system to mitigate resource usage if the
large data serve is not a common operation:
* Proxy those requests to a different server which is optimized to
handle large data serves.
* Execute the large data serves with CGI rather than Mod_Perl (rough config sketch for both options below).
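
Rough httpd.conf sketches for those last two (the hostname and paths are placeholders):

# hand the heavy endpoint to a plain CGI script instead of mod_perl
ScriptAlias /bigdata /var/www/cgi-bin/bigdata.pl

# or proxy it to a separate backend tuned for large responses
ProxyPass /bigdata http://bigdata-backend.example.com/bigdata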

I'm sure there are probably other options as well.

-wjt


Arthur P. Goldberg, PhD

Research Scientist in Bioinformatics
Plant Systems Biology Laboratory
www.virtualplant.org

Visiting Academic
Computer Science Department
Courant Institute of Mathematical Sciences
www.cs.nyu.edu/artg

a...@cs.nyu.edu
New York University
212 995-4918
Coruzzi Lab
8th Floor Silver Building
1009 Silver Center
100 Washington Sq East
New York NY 10003-6688
