Forking w/ mod_perl 2
Hi all, I have a report creation perl script that takes about 15 minutes to run and I need to fork it. I tried the code from v1:

    use strict;
    use POSIX 'setsid';
    use Apache::SubProcess;

    my $r = shift;
    $r->send_http_header('text/plain');

    $SIG{CHLD} = 'IGNORE';
    defined (my $kid = fork) or die "Cannot fork: $!\n";
    if ($kid) {
        print "Parent $$ has finished, kid's PID: $kid\n";
    } else {
        $r->cleanup_for_exec(); # untie the socket
        chdir '/'                 or die "Can't chdir to /: $!";
        open STDIN,  '/dev/null'  or die "Can't read /dev/null: $!";
        open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
        open STDERR, '>/tmp/log'  or die "Can't write to /tmp/log: $!";
        setsid or die "Can't start a new session: $!";
        select STDERR;
        local $| = 1;
        warn "started\n";
        # do something time-consuming
        sleep 1, warn "sh\n" for 1..20;
        warn "completed\n";
        CORE::exit(0); # terminate the process
    }

First problem, Apache::SubProcess doesn't seem to contain those methods anymore. Second problem is open. Can anyone tell me the proper way to fork with v2?

Thanks,
Cameron
Re: Forking w/ mod_perl 2
IMHO, it would be better to put your report code into another perl program and execute it. From what I see from your snippet of code, it's not important for the parent to know what the child is doing; you are even ignoring SIGCHLD. Also, at some point in the future (I hope at least) mp2 + threaded mpm's will become more than alpha (although I already use it extensively, but I'm crazy) and you might want to run your code in it. Forking under those circumstances would be bad.

2c

On Fri, 2003-09-12 at 14:40, Cameron B. Prince wrote:
> Hi all, I have a report creation perl script that takes about 15 minutes
> to run and I need to fork it. [snipped]

-- 
Richard F. Rebel
[EMAIL PROTECTED]
t. 212.239.
RE: Forking w/ mod_perl 2
Hi Richard,

> IMHO, it would be better to put your report code into another perl
> program and execute it. [snipped]

Thanks for your reply. The report code is in another perl program that I'm trying to execute. The code I included below was from the v1 docs:

http://perl.apache.org/docs/1.0/guide/performance.html#Forking_and_Executing_Subprocesses_from_mod_perl

Here is the code I ended up with:

    $SIG{CHLD} = 'IGNORE';
    defined (my $pid = fork) or die "Cannot fork: $!\n";
    unless ($pid) {
        exec $command;
        CORE::exit(0);
    }

This seems to work and no zombies are floating around. But I've not been able to restart Apache while the forked program is running yet to see if it's killed. More comments or ideas welcome.

Thanks,
Cameron
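A bare fork + exec like the snippet above leaves the child in Apache's session and process group, so a server restart may still take it down. A minimal sketch of a fully detached variant, in plain Perl with no mod_perl calls (the report script path and log file name here are made-up examples, not from the thread):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX 'setsid';

# Sketch: fork a child, detach it from Apache's session and standard
# handles, then exec the long-running command. Returns the child PID
# to the parent. The log path is a made-up example.
sub spawn_detached {
    my @command = @_;
    local $SIG{CHLD} = 'IGNORE';   # autoreap: no zombies to wait on
    defined(my $pid = fork) or die "Cannot fork: $!\n";
    return $pid if $pid;           # parent returns to serving the request

    setsid or die "Can't start a new session: $!";   # leave Apache's group
    chdir '/' or die "Can't chdir to /: $!";
    open STDIN,  '<',  '/dev/null'      or die "Can't read /dev/null: $!";
    open STDOUT, '>',  '/dev/null'      or die "Can't write to /dev/null: $!";
    open STDERR, '>>', '/tmp/report.log' or die "Can't open log: $!";
    exec @command or die "Can't exec @command: $!";
}

# e.g. spawn_detached('/usr/local/bin/make_report.pl');  # hypothetical script
```

With the child in its own session, restarting Apache should no longer kill the running report; whether mp2 needs an extra socket-untying step (the old cleanup_for_exec) is exactly the open question in this thread.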
Re: Forking in mod_perl?
Hello, I have been working with this exact subject for the past couple weeks and benchmarking lots of results. Although I don't have the results here to show, we have decided to use PVM ( http://www.epm.ornl.gov/pvm/pvm_home.html ) to spawn subprocesses on other machines and also on the same machine. We do many very computationally intensive tasks based on user input, and we spawn these off to other machines, as well as the local machine, to do the processing. Although we do not provide the results instantly, the user can come back or click on a link and see what the status of the process is.

There is a perl module on CPAN that we use called Parallel::Pvm that interfaces with the PVM Virtual Machine software. We have tested it doing heavy database work as well as computationally intensive tasks, and the remote process reliability is excellent. It also has the capability of talking to other spawned processes on other machines if necessary. We are in a RH6.2 Linux environment (works on FreeBSD also :), but the PVM software runs on many architectures. In a single server setup, you still get the ability to spawn a process locally, so you still get the benefit of off-loading the heavy process from mod_perl.

Hope this helps.

Bill
-- 
Bill Desjardins - [EMAIL PROTECTED] - (USA) 305.205.8644
Unix/Network Administration - Perl/Mod_Perl/DB Development
http://www.CarRacing.com - Powered by mod_perl!
FREE WebHosting for Race Tracks, Race Teams and Race Shops

On Wed, 4 Oct 2000, David E. Wheeler wrote:
> Hi All, Quick question - can I fork off a process in mod_perl? [snipped]
Re: Forking in mod_perl?
On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> Yeah, I was thinking something along these lines. Don't know if I need
> something as complex as IPC. [snipped]

> In a pinch, I'd just use something like a 'queue' directory. In other
> words, when your mod_perl code gets some info to process, it writes this
> into a file in a certain directory (name it with a timestamp / cksum to
> ensure the filename is unique). Every X seconds, have a [snipped]

It might be safer to do this in a db rather than the file system. That way there is less chance for collision, and you don't have to worry about the file being half written when the daemon comes along and tries to read it while mod_perl/apache is trying to write it. Let the DB do the storage side and let the daemon do a select to gather the info.
Re: Forking in mod_perl? (benchmarking)
I'm working with the Swish search engine (www.apache.org and the guide use it). Until this month, SWISH could only be called via a fork/exec. Now there's an early C library for swish that I've built into a perl module for use with mod_perl. Yea! No forking!

I decided to do some quick benchmarking with ab. I'm rotten at benchmarking anything, so any suggestions are welcome. My main question was this: With the library version you first call a routine to open the index files. This reads in header info and gets ready for the search. Then you run the query, and then you call a routine to close the index. OR, you can open the index file and do multiple queries without opening and closing the index each time. Somewhat like caching a DBI connection, I suppose. So I wanted to see how much faster it is to keep the index file open.

I decided to start Apache with only one child, so it would handle ALL the requests. I'm running ab on the same machine, and only doing 100 requests. Running my mod_perl program without asking for a query I can get almost 100 requests per second. That's just writing from memory and logging to an open file. Now comparing the two methods of calling SWISH, I got about 7.7 requests per second leaving the index file open between requests, and 6.5 per second opening each time. My guess is Linux is helping buffer the file contents quite a bit since this machine isn't doing anything else at the time, so there might be a wider gap if the machine was busy.

Now, here's why this post is under this subject thread: For fun I changed over to forking Apache and exec'ing SWISH each request, and I got just over 6 requests per second. I would have expected much worse, but again, I think Linux is helping out quite a bit in the fork. And for more fun, the "same" program under mod_cgi: 0.90 requests/second.

Bill Moseley mailto:[EMAIL PROTECTED]
Re: Forking in mod_perl?
On Thu, 5 Oct 2000, Sean D. Cook wrote:
> It might be safer to do this in a db rather than the file system. That
> way there is less chance for collision, and you don't have to worry
> about the file being half written when the daemon comes along and tries
> to read it while mod_perl/apache is trying to write it. [snipped]

If you don't have a db easily available, I've had good luck using temp files. You can avoid partially written file errors by exploiting the atomic nature of moving (renaming) files. NFS does *not* have this nice behavior, however.

-Tim
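The atomic-rename trick Tim describes can be sketched as follows (the queue directory layout and `.tmp`/`.job` suffixes are made-up examples): write the payload to a temp name inside the queue directory, then rename it into place, so the polling daemon only ever sees complete files. Note rename is atomic only within one filesystem, which is why the temp file lives in the queue directory itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Sketch of an atomic queue write: the daemon only picks up *.job files,
# and a *.job file only appears via rename(), never half-written.
# Suffix conventions here are made-up examples.
sub enqueue_job {
    my ($queue_dir, $payload) = @_;
    # Temp file in the same directory, so rename() stays on one filesystem.
    my ($fh, $tmpname) = tempfile('job-XXXXXX', DIR => $queue_dir, SUFFIX => '.tmp');
    print {$fh} $payload;
    close $fh or die "close $tmpname: $!";
    (my $final = $tmpname) =~ s/\.tmp$/.job/;
    rename $tmpname, $final or die "rename $tmpname: $!";  # the atomic step
    return $final;
}
```

As Tim says, this guarantee does not hold over NFS, where rename can behave non-atomically.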
Forking in mod_perl?
Hi All, Quick question - can I fork off a process in mod_perl? I've got a piece of code that needs to do a lot of processing that's unrelated to what shows up in the browser. So I'd like to be able to fork the processing off and return data to the browser, letting the forked process handle the extra processing at its leisure. Is this doable? Is forking a good idea in a mod_perl environment? Might there be another way to do it? TIA for the help! David -- David E. Wheeler Software Engineer Salon Internet ICQ: 15726394 [EMAIL PROTECTED] AIM: dwTheory
RE: Forking in mod_perl?
> -----Original Message-----
> From: David E. Wheeler [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, October 04, 2000 3:44 PM
> Subject: Forking in mod_perl?
>
> Hi All, Quick question - can I fork off a process in mod_perl? [snipped]

http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess

The cleanup phase is also a good place to do extended processing. It does tie up the child until the processing finishes, but it at least makes the client think that the response is finished (so that little moving thingy in netscape stops moving around).

HTH

--Geoff
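In the Apache 1.x Perl API, the cleanup trick Geoff mentions is a callback registered during the response phase; a sketch (the handler package name and report routine are made-up examples, and this fragment only runs inside mod_perl, not standalone):

```perl
package My::ReportHandler;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    $r->send_http_header('text/plain');
    print "Your report is being generated.\n";

    # The cleanup runs after the response is sent and the connection is
    # closed, so the browser stops spinning; this httpd child is still
    # tied up until generate_report() returns.
    $r->register_cleanup(\&generate_report);
    return OK;
}

sub generate_report {
    # hypothetical long-running work goes here
    return 1;
}

1;
```

The trade-off is exactly what the thread goes on to discuss: the client is released immediately, but the Apache child is unavailable for new requests until the cleanup finishes.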
Re: Forking in mod_perl?
Hi David,

Check out the guide at http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess

The Eagle book also covers the C API subprocess details on pages 622-631. Let us know if the guide is unclear to you, so we can improve it.

Ed

"David E. Wheeler" wrote:
> Hi All, Quick question - can I fork off a process in mod_perl? [snipped]
RE: Forking in mod_perl?
I was just going to post that url to the guide also... But another option I've come up with not listed in the guide is to use the *nix "at" command. If I need to run some processor intensive application that doesn't need apache_anything, I'll do a system call to "at" to schedule it to run (usually I pass in "now"). However, the drawbacks are that it's a completely separate process, and passing complicated structures isn't worth the time to think about when using at.

Jay

On Wed, 4 Oct 2000, Geoffrey Young wrote:
> http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess
>
> the cleanup phase is also a good place to do extended processing [snipped]
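Jay's "at now" trick might look like the sketch below (the report script path is a made-up example, and this assumes atd is installed and running). `at` reads the commands to run from stdin, so piping into it keeps shell quoting problems to a minimum:

```perl
use strict;
use warnings;

# Sketch of handing a job to atd instead of forking inside Apache.
# Assumes the at/atd tools exist on the box; the script path is a
# made-up example.
sub schedule_now {
    my ($command) = @_;
    # List-form pipe open: 'at' and 'now' are passed as separate argv
    # entries, and the job body arrives on at's stdin.
    open my $at, '|-', 'at', 'now' or die "Can't run at: $!";
    print {$at} "$command\n";
    close $at or die "at exited with an error: $?";
}

# e.g. schedule_now('/usr/local/bin/make_report.pl --all');  # hypothetical
```

As Jay notes, anything beyond simple scalar arguments is awkward to pass this way; the command line (or a file it points at) is the only channel into the job.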
Re: Forking in mod_perl?
ed phillips wrote:
> Check out the guide at
> http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess
> [snipped]

Yeah, it's a bit unclear. If I understand correctly, it's suggesting that I do a system() call and have the called perl script detach itself from Apache, yes? I'm not too sure I like this approach. I was hoping for something a little more integrated. And how much overhead are we talking about with this approach?

Using the cleanup phase, as Geoffrey Young suggests, might be a bit nicer, but I'll have to look into how much time my processing will likely take, hogging up an apache fork while it finishes.

Either way, I'll have to think about various ways to handle this stuff, since I'm writing it into a regular Perl module that will then be called from mod_perl...

Thanks,

David
Re: Forking in mod_perl?
I hope it is clear that you don't want to fork the whole server! Mod_cgi goes to great pains to effectively fork a subprocess, and was, I believe, the major impetus for the development of the C subprocess API. It (the source code for mod_cgi) is a great place to learn some of the subtleties, as the Eagle book points out. As the Eagle book says, Apache is a complex beast. Mod_perl gives you the power to use the beast to your best advantage.

Now you are faced with a trade off. Is it more expensive to detach a subprocess, or use the child cleanup phase to do some extra processing? I'd have to know more specifics to answer that with any modicum of confidence.

Cheers,

Ed

"David E. Wheeler" wrote:
> Yeah, it's a bit unclear. If I understand correctly, it's suggesting
> that I do a system() call and have the called perl script detach itself
> from Apache, yes? [snipped]
Re: Forking in mod_perl?
On Wed, 4 Oct 2000, ed phillips wrote:
> Now you are faced with a trade off. Is it more expensive to detach a
> subprocess, or use the child cleanup phase to do some extra processing?
> I'd have to know more specifics to answer that with any modicum of
> confidence.

He might try a daemon coprocess using some IPC to communicate with Apache, which is my favorite way to do it..

-- 
"The Funk, the whole Funk, and nothing but the Funk."
Linux barcode software mirror: http://dadadada.net/cuecat
Billy Donahue mailto:[EMAIL PROTECTED]
Re: Forking in mod_perl?
ed phillips wrote:
> I hope it is clear that you don't want to fork the whole server! Mod_cgi
> goes to great pains to effectively fork a subprocess, and was, I believe,
> the major impetus for the development of the C subprocess API. [snipped]

Yeah, but I don't speak C. Just Perl. And it looks like the way to do it in Perl is to call system() and then detach the called script. I was trying to keep this all nice and tidy in modules, but I don't know if it'll be possible.

> Now you are faced with a trade off. Is it more expensive to detach a
> subprocess, or use the child cleanup phase to do some extra processing?
> I'd have to know more specifics to answer that with any modicum of
> confidence.

I think I can probably evaluate that with a few tests. Thanks!

David
Re: Forking in mod_perl?
Billy Donahue wrote:
> He might try a daemon coprocess using some IPC to communicate with
> Apache, which is my favorite way to do it..

Yeah, I was thinking something along these lines. Don't know if I need something as complex as IPC. I was thinking of perhaps a second Apache server set up just to handle long-term processing. Then the first server could send a request to the second with the commands it needs to execute in a header. The second server processes those commands independently of the first server, which then returns data to the browser. But maybe that's overkill. I'll have to weigh the heft of the post-request processing I need to do.

Thanks for the suggestion!

David
Re: Forking in mod_perl?
On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> Yeah, I was thinking something along these lines. Don't know if I need
> something as complex as IPC. I was thinking of perhaps a second Apache
> server set up just to handle long-term processing. [snipped]

In a pinch, I'd just use something like a 'queue' directory. In other words, when your mod_perl code gets some info to process, it writes this into a file in a certain directory (name it with a timestamp / cksum to ensure the filename is unique). Every X seconds, have a daemon poll the directory; if it finds a file, it processes it. If not, it goes back to sleep for X seconds.

I guess it's a poor man's IPC. But it runs over NFS nicely, it's *very* simple, it's portable, and I've never needed anything more complex. You also don't need to fork the daemon or start up a new script for every processing request. But if you need to do the processing in realtime, waiting up to X seconds for the results might be unacceptable.

How does this sound?

HTH,

Neil
-- 
Neil Conway [EMAIL PROTECTED]
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed
It is dangerous to be right when the government is wrong. -- Voltaire
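One pass of Neil's polling daemon might be sketched like this (the `.job` suffix, directory layout, and `process_job` callback are made-up examples): list the job files, process each oldest-first, and delete them; the daemon itself is just this pass inside a sleep loop.

```perl
use strict;
use warnings;

# Sketch of one pass of a poor-man's-IPC queue daemon: pick up every
# *.job file in the spool directory, hand its contents to a callback,
# then remove it. Naming conventions here are made-up examples.
sub run_queue_pass {
    my ($queue_dir, $process) = @_;
    opendir my $dh, $queue_dir or die "opendir $queue_dir: $!";
    # Lexicographic sort gives oldest-first if names start with a timestamp.
    my @jobs = sort grep { /\.job$/ } readdir $dh;
    closedir $dh;
    for my $name (@jobs) {
        my $path = "$queue_dir/$name";
        open my $fh, '<', $path or next;   # vanished? another pass got it
        local $/;                          # slurp the whole payload
        my $payload = <$fh>;
        close $fh;
        $process->($payload);
        unlink $path or warn "unlink $path: $!";
    }
    return scalar @jobs;                   # how many jobs this pass handled
}

# The daemon is then just:
#   while (1) { run_queue_pass($dir, \&process_job); sleep $interval; }
```

Paired with the atomic-rename write on the producer side, the daemon never sees a partially written job file.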
Re: Forking in mod_perl?
I use a database table for the queue. No file locking issues, atomic transactions, you can sort and order the jobs, etc., and you can wrap the entire "queue" library in a module. Plus, the background script that processes the queue can easily run with higher permissions, and you don't have to worry as much about setuid issues when forking from a parent process (like your apache) running as a user with fewer privileges than what you (may) need. You can pass all the args you need via a column in the db, and, if passing data back and forth is a must, serialize your data using Storable and have the queue runner thaw it back out. Very simple, very fast, very powerful.

On Wed, 4 Oct 2000, Neil Conway wrote:
> In a pinch, I'd just use something like a 'queue' directory. In other
> words, when your mod_perl code gets some info to process, it writes this
> into a file in a certain directory (name it with a timestamp / cksum to
> ensure the filename is unique). [snipped]
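The Storable round trip described above (freeze the job args into a column, thaw them in the queue runner) is core Perl; a sketch with a made-up job structure, with the DBI side shown only as a comment since no schema is given in the thread:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Sketch of stashing arbitrary job arguments in a DB queue column.
# The job structure is a made-up example.
my $job = {
    task => 'monthly_report',
    args => { month => '2000-10', format => 'csv' },
};

# freeze() yields an opaque byte string, suitable for a BLOB column:
my $frozen = freeze($job);
# e.g. $dbh->do('INSERT INTO queue (payload) VALUES (?)', undef, $frozen);

# The queue runner selects the row and thaws the structure back out:
my $thawed = thaw($frozen);
```

Because the payload travels as a single opaque column, the web-facing code and the privileged queue runner never need to agree on anything but the frozen structure's shape.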
Re: Forking in mod_perl?
David E. Wheeler writes:
> Using the cleanup phase, as Geoffrey Young suggests, might be a bit
> nicer, but I'll have to look into how much time my processing will
> likely take, hogging up an apache fork while it finishes.

I've wondered about this as well. I really like the cleanup handler, and thought that in general it would be better to tie up the httpd process and let apache decide when a new process is needed, rather than always forking. For the most part I use the cleanup handlers to handle something that takes a lot of time but doesn't happen very often. If I had something that took a lot of time every time someone hit a page, I still don't think I'd fork; instead I'd pass off the information to another process and let that process run through the data asynchronously, like a spooler...

-- [EMAIL PROTECTED]