Shared memory 'copy-on-write' issue
Hi all, reading http://perl.apache.org/docs/1.0/api/Apache/SizeLimit.html#Shared_Memory_Options I see that the link about memory sharing by copy-on-write points to http://perl.apache.org/docs/1.0/guide/index.html and 'META: change link when site is live' stands after it. The site is live now; who knows where this link should point? Thanks, Anton Permyakov.
Re: Shared memory 'copy-on-write' issue
Anton Permyakov wrote: reading http://perl.apache.org/docs/1.0/api/Apache/SizeLimit.html#Shared_Memory_Options I see that the link about memory sharing by copy-on-write points to http://perl.apache.org/docs/1.0/guide/index.html and 'META: change link when site is live' stands after it. The site is live now; who knows where this link should point? It does go to the guide correctly, but a more specific link would be this: http://perl.apache.org/docs/1.0/guide/performance.html#Sharing_Memory - Perrin
Re: Shared memory 'copy-on-write' issue
Perrin Harkins wrote: Anton Permyakov wrote: reading http://perl.apache.org/docs/1.0/api/Apache/SizeLimit.html#Shared_Memory_Options I see that the link about memory sharing by copy-on-write points to http://perl.apache.org/docs/1.0/guide/index.html and 'META: change link when site is live' stands after it. The site is live now; who knows where this link should point? It does go to the guide correctly, but a more specific link would be this: http://perl.apache.org/docs/1.0/guide/performance.html#Sharing_Memory I've fixed the link to point to the above URL. Thanks for pointing this out. __ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Shared memory
Hello! :) After moving to RedHat 7.3 with kernel 2.4.18-3smp the system can't use shared memory: --- CPU0 states: 26.1% user, 13.0% system, 0.0% nice, 59.0% idle CPU1 states: 24.0% user, 10.1% system, 0.0% nice, 64.0% idle Mem: 1030724K av, 953088K used, 77636K free, 0K shrd, 27856K buff Swap: 1052248K av, 0K used, 1052248K free, 619432K cached --- Can anybody give me info about this problem? Thank you in advance. Kosyo
Re: loss of shared memory in parent httpd (2)
Eric Frazier wrote: Hi, This may be totally ignorant crap, but I noticed this when I was reading the ps man page on BSD 4.5 about sys/proc.h flags. This one I noticed: P_SYSTEM 0x00200 System proc: no sigs, stats or swapping Could this mean what I think it means? That a process with this flag set won't be swapped out? I've spent some time with our friend google and here is what I came up with (it's hard to search for things which can be called in many different ways): I've searched for P_SYSTEM and it seems that it's a *BSD thing, and also when you set it you don't get sigs delivered. e.g. see the discussion here: http://mail-index.netbsd.org/tech-kern/1998/09/08/0004.html There is also: madvise(2) - give advice about use of memory Has anybody tried to use it? Can this help? There is some discussion here: http://lists.insecure.org/linux-kernel/2001/Oct/0877.html Here is another observation and explanation of the swapping/mem unsharing phenomenon on linux: http://www.uwsg.iu.edu/hypermail/linux/kernel/0110.3/0324.html http://www.uwsg.iu.edu/hypermail/linux/kernel/0110.3/0307.html Finally, apparently it's relatively easy to patch the linux kernel to enable mlock for non-root processes: http://www.uwsg.iu.edu/hypermail/linux/kernel/9608.2/0280.html At 03:55 PM 3/12/02 +0100, Elizabeth Mattijsen wrote: Oops. Premature sending... I have two ideas that might help: - reduce number of global variables used, less memory pollution by lexicals - make sure that you have the most up-to-date (kernel) version of your OS. Newer Linux kernels seem to be a lot savvier at handling shared memory than older kernels. Again, I wish you strength in fixing this problem... Elizabeth Mattijsen http://www.kwinternet.com/eric (250) 655 - 9513 (PST Time Zone) -- _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
RE: loss of shared memory in parent httpd
Hi all, On Sat, 16 Mar 2002, Bill Marrs wrote: it leads one to wonder if some of our assumptions or tools used to monitor memory are inaccurate or we're misinterpreting them. Well, 'top' on Linux is rubbish for sure. 73, Ged.
RE: loss of shared memory in parent httpd
Hi, I had hoped that FreeBSD would be immune, but it seems not. I have been bashing it with http_load and all of a sudden (after a LOT of bashing and swapping) all of my processes had zero shared. It did take me days of fiddling to run into this, though. Thanks, Eric At 04:16 PM 3/16/02 -0500, Ed Grimm wrote: I believe I have the answer... The problem is that the parent httpd swaps, and any new children it creates load the portion of memory that was swapped from swap, which does not make it copy-on-write. The really annoying thing - when memory gets tight, the parent is the most likely httpd process to swap, because its memory is 99% idle. This issue afflicts Linux, Solaris, and a bunch of other OSes. The solution is mlockall(2), available under Linux, Solaris, and other POSIX.1b compliant OSes. I've not experimented with calling it from perl, and I've not looked at Apache enough to consider patching it there, but this system call, if your process is run as root, will prevent any and all swapping of your process's memory. If your process is not run as root, it returns an error. The reason turning off swap works is because it forces the memory from the parent process that was swapped out to be swapped back in. It will not fix those processes that have been sired after the shared memory loss, as of Linux 2.2.15 and Solaris 2.6. (I have not checked since then for behavior in this regard, nor have I checked on other OSes.) Ed On Thu, 14 Mar 2002, Bill Marrs wrote: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Seems like a crazy thing to do, though. You'll also want to look into tuning your paging algorithm. Yeah... I'll look into it. If I had a way to tell the kernel to never swap out any httpd process, that would be a great solution. The kernel is making a bad choice here. By swapping, it triggers more memory usage because sharing is removed on the httpd process group (thus multiplied)... I've got MaxClients down to 8 now and it's still happening. I think my best course of action may be a crontab swap flusher. -bill http://www.kwinternet.com/eric (250) 655 - 9513 (PST Time Zone)
Re: loss of shared memory in parent httpd (2)
Hi, This may be totally ignorant crap, but I noticed this when I was reading the ps man page on BSD 4.5 about sys/proc.h flags. This one I noticed: P_SYSTEM 0x00200 System proc: no sigs, stats or swapping Could this mean what I think it means? That a process with this flag set won't be swapped out? Thanks, Eric At 03:55 PM 3/12/02 +0100, Elizabeth Mattijsen wrote: Oops. Premature sending... I have two ideas that might help: - reduce number of global variables used, less memory pollution by lexicals - make sure that you have the most up-to-date (kernel) version of your OS. Newer Linux kernels seem to be a lot savvier at handling shared memory than older kernels. Again, I wish you strength in fixing this problem... Elizabeth Mattijsen http://www.kwinternet.com/eric (250) 655 - 9513 (PST Time Zone)
RE: loss of shared memory in parent httpd
I believe I have the answer... The problem is that the parent httpd swaps, and any new children it creates load the portion of memory that was swapped from swap, which does not make it copy-on-write. The really annoying thing - when memory gets tight, the parent is the most likely httpd process to swap, because its memory is 99% idle. This issue afflicts Linux, Solaris, and a bunch of other OSes. The solution is mlockall(2), available under Linux, Solaris, and other POSIX.1b compliant OSes. I've not experimented with calling it from perl, and I've not looked at Apache enough to consider patching it there, but this system call, if your process is run as root, will prevent any and all swapping of your process's memory. If your process is not run as root, it returns an error. The reason turning off swap works is because it forces the memory from the parent process that was swapped out to be swapped back in. It will not fix those processes that have been sired after the shared memory loss, as of Linux 2.2.15 and Solaris 2.6. (I have not checked since then for behavior in this regard, nor have I checked on other OSes.) Ed On Thu, 14 Mar 2002, Bill Marrs wrote: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Seems like a crazy thing to do, though. You'll also want to look into tuning your paging algorithm. Yeah... I'll look into it. If I had a way to tell the kernel to never swap out any httpd process, that would be a great solution. The kernel is making a bad choice here. By swapping, it triggers more memory usage because sharing is removed on the httpd process group (thus multiplied)... I've got MaxClients down to 8 now and it's still happening. I think my best course of action may be a crontab swap flusher. -bill
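[Editor's sketch] For readers who want to experiment with Ed's suggestion, a minimal sketch of calling mlockall(2) from Perl follows. This is an illustration, not code from the thread: it assumes syscall.ph has been generated by h2ph, the MCL_* values shown are the usual Linux ones (verify against sys/mman.h on your platform), and it would run as root in the parent httpd (e.g. from startup.pl) before children are forked:

    use strict;
    require 'syscall.ph';            # h2ph-generated; provides &SYS_mlockall

    # Usual Linux values -- check sys/mman.h on your platform.
    use constant MCL_CURRENT => 1;   # lock every page currently mapped
    use constant MCL_FUTURE  => 2;   # also lock pages mapped from now on

    # Pin the parent's idle-but-shared pages so they are never swapped out.
    # mlockall(2) fails with EPERM unless the process runs as root.
    if (syscall(&SYS_mlockall, MCL_CURRENT | MCL_FUTURE) != 0) {
        warn "mlockall failed: $!\n";
    }

POSIX says memory locks are not inherited across fork(), but pinning the parent's copy of the shared pages in RAM is exactly what this thread is after; whether children need their own call is something to verify per OS.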
Re: loss of shared memory in parent httpd
Yes, this is my theory also. I figured this out a while back, and started a thread on this list, but since then haven't had enough time to investigate it further. The thread is here: http://mathforum.org/epigone/modperl/wherdtharvoi which includes some helpful hints from Doug on how to call mlockall() from the mod_perl parent process. HTH... I'm very curious to know if this works. -Adi Ed Grimm wrote: I believe I have the answer... The problem is that the parent httpd swaps, and any new children it creates load the portion of memory that was swapped from swap, which does not make it copy-on-write. The really annoying thing - when memory gets tight, the parent is the most likely httpd process to swap, because its memory is 99% idle. This issue afflicts Linux, Solaris, and a bunch of other OSes. The solution is mlockall(2), available under Linux, Solaris, and other POSIX.1b compliant OSes. I've not experimented with calling it from perl, and I've not looked at Apache enough to consider patching it there, but this system call, if your process is run as root, will prevent any and all swapping of your process's memory. If your process is not run as root, it returns an error. The reason turning off swap works is because it forces the memory from the parent process that was swapped out to be swapped back in. It will not fix those processes that have been sired after the shared memory loss, as of Linux 2.2.15 and Solaris 2.6. (I have not checked since then for behavior in this regard, nor have I checked on other OSes.) Ed On Thu, 14 Mar 2002, Bill Marrs wrote: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Seems like a crazy thing to do, though. You'll also want to look into tuning your paging algorithm. Yeah... I'll look into it. If I had a way to tell the kernel to never swap out any httpd process, that would be a great solution. The kernel is making a bad choice here. By swapping, it triggers more memory usage because sharing is removed on the httpd process group (thus multiplied)... I've got MaxClients down to 8 now and it's still happening. I think my best course of action may be a crontab swap flusher. -bill
RE: loss of shared memory in parent httpd
The reason turning off swap works is because it forces the memory from the parent process that was swapped out to be swapped back in. It will not fix those processes that have been sired after the shared memory loss, as of Linux 2.2.15 and Solaris 2.6. (I have not checked since then for behavior in this regard, nor have I checked on other OSes.) In my case, I'm using Linux 2.4.17; when I turn off swap and turn it back on again, it restores the shared memory of both the parent and the children Apache processes. This seems counter-intuitive, as it would seem the kernel memory manager would have to bend over backwards to accomplish this re-binding of the swapped-out shared memory pages. Thus, it leads one to wonder if some of our assumptions or tools used to monitor memory are inaccurate or we're misinterpreting them. -bill
RE: loss of shared memory in parent httpd
It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Seems like a crazy thing to do, though. You'll also want to look into tuning your paging algorithm. Yeah... I'll look into it. If I had a way to tell the kernel to never swap out any httpd process, that would be a great solution. The kernel is making a bad choice here. By swapping, it triggers more memory usage because sharing is removed on the httpd process group (thus multiplied)... I've got MaxClients down to 8 now and it's still happening. I think my best course of action may be a crontab swap flusher. -bill
Re: loss of shared memory in parent httpd
On Thu, 14 Mar 2002 07:25:27 -0500, Bill Marrs [EMAIL PROTECTED] said: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Funny, I've been doing this for ages and I never really knew why; you just found the reason, Thank You! My concerns were similar to yours but on a smaller scale, so I did not worry that much, but I'm running a swapflusher regularly. Make sure you have a recent kernel, because all old kernels up to 2.4.12 or so were extremely unresponsive during swapoff. With current kernels, this is much, much faster and nothing to worry about. Let me show you the script I use for the job. No rocket science, but it's easy to do it wrong. Be careful to maintain equality of priority among disks:

    use strict;
    $| = 1;
    print "Running swapon -a, just in case...\n";
    system "swapon -a";
    print "Running swapon -s\n";
    open S, "swapon -s |";
    my(%prio);
    PARTITION: while (<S>) {
        print;
        next if /^Filename/;
        chop;
        my($f, $t, $s, $used, $p) = split;
        my $disk = $f;
        $disk =~ s/\d+$//;        # strip the partition number to get the disk
        $prio{$disk} ||= 5;
        $prio{$disk}--;           # keep priorities equal across each disk's partitions
        if ($used == 0) {
            print "Unused, skipping\n";
            next PARTITION;
        }
        print "Turning off\n";
        system "swapoff $f";
        print "Turning on with priority $prio{$disk}\n";
        system "swapon -p $prio{$disk} $f";
    }
    system "swapon -s";

Let me know if you see room for improvements, Regards, -- andreas
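[Editor's sketch] Wiring such a flusher into cron every 30 minutes, as Bill suggested earlier, is one line; the script path and log file below are hypothetical:

    # /etc/crontab -- run the swap flusher every 30 minutes as root
    */30 * * * * root /usr/local/sbin/swapflush.pl >>/var/log/swapflush.log 2>&1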
[OT]RE: loss of shared memory in parent httpd
Call me an idiot. How is it even remotely possible that turning off swap restores memory shared between processes? Is the Linux kernel going from process to process comparing pages of memory as they re-enter RAM? Oh, those two look identical, they'll get shared? -Incredulous -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, March 14, 2002 8:24 AM To: Bill Marrs Cc: [EMAIL PROTECTED] Subject: Re: loss of shared memory in parent httpd On Thu, 14 Mar 2002 07:25:27 -0500, Bill Marrs [EMAIL PROTECTED] said: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Funny, I've been doing this for ages and I never really knew why; you just found the reason, Thank You! My concerns were similar to yours but on a smaller scale, so I did not worry that much, but I'm running a swapflusher regularly. Make sure you have a recent kernel, because all old kernels up to 2.4.12 or so were extremely unresponsive during swapoff. With current kernels, this is much, much faster and nothing to worry about. Let me show you the script I use for the job. No rocket science, but it's easy to do it wrong. Be careful to maintain equality of priority among disks:

    use strict;
    $| = 1;
    print "Running swapon -a, just in case...\n";
    system "swapon -a";
    print "Running swapon -s\n";
    open S, "swapon -s |";
    my(%prio);
    PARTITION: while (<S>) {
        print;
        next if /^Filename/;
        chop;
        my($f, $t, $s, $used, $p) = split;
        my $disk = $f;
        $disk =~ s/\d+$//;
        $prio{$disk} ||= 5;
        $prio{$disk}--;
        if ($used == 0) {
            print "Unused, skipping\n";
            next PARTITION;
        }
        print "Turning off\n";
        system "swapoff $f";
        print "Turning on with priority $prio{$disk}\n";
        system "swapon -p $prio{$disk} $f";
    }
    system "swapon -s";

Let me know if you see room for improvements, Regards, -- andreas
Re: [OT]RE: loss of shared memory in parent httpd
How is it even remotely possible that turning off swap restores memory shared between processes? Is the Linux kernel going from process to process comparing pages of memory as they re-enter RAM? Oh, those two look identical, they'll get shared? This is a good point. I really have no clue how the kernel deals with swapping/sharing, so I can only speculate. I could imagine that it's possible for it to do this; if the pages are marked properly, they could be restored. But, I'll admit, it seems unlikely. ...and, I had this thought before. Maybe this apparent loss of shared memory is an illusion. It appears to make the amount of memory that the httpds use grow very high, but perhaps it is a kind of shared-swap, and thus the calculation I'm using to determine overall memory usage would need to also factor out swap. ...in which case, there's no problem at all. But, I do see an albeit qualitative performance increase and CPU load lowering when I get the httpds to stay shared (and unswapped). So, I think it does matter. Though, if you think about it, it sort of makes sense. Some portion of the shared part of the httpd is also not being used much, so it gets swapped out to disk. But, if those pages really aren't being used, then there shouldn't be a performance hit. If they are being used, then they'd get swapped back in. ...which sort of disproves my qualitative reasoning that swap/unshared is bad. My head hurts; maybe I should join a kernel mailing list and see if someone there can help me (and if I can understand them). -bill
RE: loss of shared memory in parent httpd
On Thu, 14 Mar 2002, Bill Marrs wrote: It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. agreed. What's interesting is that if I turn swap off and back on again, what? doesn't seem to me like you are agreeing, and the original quote doesn't make sense either (because a shared page is a shared page, it can only be in one spot until/unless it gets copied). a shared page is swapped to disk. It then gets swapped back in, but for some reason the kernel seems to treat swapping a page back in as copying the page, which doesn't seem logical... anyone here got a more direct line with someone like Alan Cox? That is, _unless_ you copy all the swap space back in (e.g. swapoff)..., but that is probably a very different operation than demand paging. the sharing is restored! So, now I'm tempted to run a crontab every 30 minutes that turns the swap off and on again, just to keep the httpds shared. No Apache restart required! Seems like a crazy thing to do, though. You'll also want to look into tuning your paging algorithm. Yeah... I'll look into it. If I had a way to tell the kernel to never swap out any httpd process, that would be a great solution. The kernel is making a bad choice here. By swapping, it triggers more memory usage because sharing is removed on the httpd process group (thus multiplied)... the kernel doesn't want to swap out data in any case... if it does, it means memory pressure is reasonably high. AFAIK the kernel would far rather drop executable code pages which it can just go re-read... I've got MaxClients down to 8 now and it's still happening. I think my best course of action may be a crontab swap flusher. or reduce MaxRequestsPerChild? Stas also has some tools for causing children to exit early if their memory usage goes above some limit. I'm sure it's in the guide. -bill -- [EMAIL PROTECTED] | Courage is doing what you're afraid to do. http://BareMetal.com/ | There can be no courage unless you're scared. | - Eddie Rickenbacker
Re: [OT]RE: loss of shared memory in parent httpd
Bill Marrs wrote: You actually can do this. See the mergemem project: http://www.complang.tuwien.ac.at/ulrich/mergemem/ I'm interested in this, but it involves a kernel hack and the latest version is from 29-Jan-1999, so I got cold feet. It was a student project, and unless someone tells me differently, it wasn't picked up by the community. In any case, I've mentioned this as a proof of concept. Of course I'd love to see a working tool too. _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
Re: loss of shared memory in parent httpd
I just wanted to mention that the theory that my loss of shared memory in the parent is related to swapping seems to be correct. When the lack of sharing occurs, it is correlated with my httpd processes showing a SWAP (from top/ps) of 7.5MB, which is roughly equal to the amount of sharing that I lose. I've been lowering my MaxClients setting (from 25 to 10, so far) in hopes of finding a new balance where SWAP is not used, and more RAM is on order. Thanks -bill
RE: loss of shared memory in parent httpd
It's copy-on-write. The swap is a write-to-disk. There's no such thing as sharing memory between one process on disk (/swap) and another in memory. You'll also want to look into tuning your paging algorithm. This will hold swapping at bay longer. The paging algorithm USED to be configurable with a bunch of kernel params (params.(h|c)), stuff like PAGEHANDSFREE, but I really don't know anymore. HTH, -Josh -Original Message- From: Ask Bjoern Hansen [mailto:[EMAIL PROTECTED]] Sent: Tuesday, March 12, 2002 7:17 PM To: Graham TerMarsch Cc: Bill Marrs; [EMAIL PROTECTED] Subject: Re: loss of shared memory in parent httpd On Tue, 12 Mar 2002, Graham TerMarsch wrote: [...] We saw something similar here, running on Linux servers. Turned out to be that if the server swapped hard enough to swap an HTTPd out, then you basically lost all the shared memory that you had. I can't explain all of the technical details and the kernel-ness of it all, but from watching our own servers here this is what we saw on some machines that experienced quite a high load. Our quick solution was first to reduce the number of mod_perls that we had running, using the proxy-front-end/modperl-back-end technique, You should always do that. :-) and then supplemented that by adding another Gig of RAM to the machine. And yes, once you've lost the shared memory, there isn't a way to get it back as shared again. And yes, I've also seen that when this happens it could well take the whole server right down the toilet with it (as then your ~800MB of shared memory becomes ~800MB of _physical_ memory needed, and that could throw the box into swap city). I forwarded this mail to one of the CitySearch sysadmins who had told me about seeing this. He is seeing the same thing (using kernel 2.4.17), except that if he disables swap then the processes will get back to reporting more shared memory. So maybe it's really just GTop or the kernel reporting swapped stuff in an odd way. No, I can't explain the nitty gritty either. :-) Someone should write up a summary of this thread and ask in a technical linux place, or maybe ask Dean Gaudet. - ask -- ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
Re: loss of shared memory in parent httpd
Stas Bekman wrote: Bill Marrs wrote: One more piece of advice: I find it easier to tune memory control with a single parameter. Setting up a maximum size and a minimum shared size is not as effective as setting up a maximum *UNSHARED* size. After all, it's the amount of real memory being used by each child that you care about, right? Apache::SizeLimit has this now, and it would be easy to add to GTopLimit (it's just $SIZE - $SHARED). Doing it this way helps avoid unnecessary process turnover. I agree. For me, with my ever more bloated Perl code, I find this unshared number to be easier to keep a lid on. I keep my apache children under 10MB each unshared, as you say. That number is more stable than the SIZE/SHARED numbers that GTopLimit offers. But, I have the GTopLimit sources, so I plan to tweak them to allow for an unshared setting. I think I bugged Stas about this a year ago and he had a reason why I was wrong to think this way, but I never understood it. I don't remember why I was arguing against it :) But in any case, I'll simply add this third option, so you can control it by either SIZE/SHARED or UNSHARED. It's on CPAN now. _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
loss of shared memory in parent httpd
I'm a heavy mod_perl user, running 3 sites as virtual servers, all with lots of custom Perl code. My httpd's are huge (~50MB), but with the help of a startup file I'm able to get them sharing most of their memory (~43MB). With the help of GTopLimit, I'm able to keep the memory usage under control. But... recently, something happened, and things have changed. After some random amount of time (1 to 40 minutes or so, under load), the parent httpd suddenly loses about 7-10MB of share between it and any new child it spawns. As you can imagine, the memory footprint of my httpds skyrockets and the delicate balance I set up is disturbed. Also, GTopLimit is no help in this case - it actually causes flailing because each new child starts with memory sharing that is out of bounds and is thus killed very quickly. Restarting Apache resets the memory usage and restores the server to smooth operation. Until it happens again. Using GTop() to get the shared memory of each child before and after running my perl for each page load showed that it wasn't my code causing the jump, but suddenly the child, after having a good amount of shared memory in use, loses a 10MB chunk and from then on the other children follow suit. So, something I did on the server (I'm always doing stuff!) has caused this change to happen, but I've been pulling my hair out for days trying to track it down. I am now getting desperate. One of the recent things I did was to run tux (another web server) to serve my images, but I don't see how that could have any effect on this. If anyone has any ideas what might cause the httpd parent (and new children) to lose a big chunk of shared memory between them, please let me know. Thanks in advance, -bill
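[Editor's sketch] For readers trying to reproduce Bill's measurements: the GTop module (the libgtop wrapper that GTopLimit is built on) exposes the per-process numbers directly. A minimal sketch, assuming libgtop and GTop are installed; sizes come back in bytes:

    use GTop ();

    my $gtop = GTop->new;

    # Memory map of the current process: total size, the portion shared
    # with other processes, and the unshared remainder that each child
    # really costs you.
    my $mem      = $gtop->proc_mem($$);
    my $size     = $mem->size;
    my $share    = $mem->share;
    my $unshared = $size - $share;

    printf "pid %d: size=%.1fMB shared=%.1fMB unshared=%.1fMB\n",
           $$, map { $_ / 1024 / 1024 } $size, $share, $unshared;

Calling this at the start and end of a request handler, as Bill describes, shows whether your own code is what unshares the pages.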
Re: loss of shared memory in parent httpd
At 09:18 AM 3/12/02 -0500, Bill Marrs wrote: If anyone has any ideas what might cause the httpd parent (and new children) to lose a big chunk of shared memory between them, please let me know. I've seen this happen many times. One day it works fine, the next you're in trouble. And in my experience, it's not a matter of why this avalanche effect happens, but more a matter of why it didn't happen before. You may not have realised that you were just below a threshold and now you're over it. And the change can be as small as the size of a heavily used template that suddenly gets over an internal memory allocation border, which in turn causes Perl to allocate more, which in turn causes memory to become unshared. I have been thinking about a perl/C routine that would internally use all of the memory that was already allocated by Perl. Such a routine would need to be called when the initial start of Apache is complete, so that any child that is spawned has a saturated memory pool, so that any new variables would need to use newly allocated memory, which would be unshared. But at least all of that memory would be used for new variables and not have the tendency to pollute old memory segments. I'm not sure whether my assessment of the problem is correct. I would welcome any comments on this. I have two ideas that might help: - other than making sure that you have the most up-to-date (kernel) version of your OS. Older Linux kernels seem to have this problem a lot more than newer kernels. I wish you strength in fixing this problem... Elizabeth Mattijsen
Re: loss of shared memory in parent httpd (2)
Oops. Premature sending... I have two ideas that might help: - reduce the number of global variables used; less memory pollution by lexicals - make sure that you have the most up-to-date (kernel) version of your OS. Newer Linux kernels seem to be a lot savvier at handling shared memory than older kernels. Again, I wish you strength in fixing this problem... Elizabeth Mattijsen
Re: loss of shared memory in parent httpd
Bill Marrs wrote: I'm a heavy mod_perl user, running 3 sites as virtual servers, all with lots of custom Perl code. My httpd's are huge (~50MB), but with the help of a startup file I'm able to get them sharing most of their memory (~43MB). With the help of GTopLimit, I'm able to keep the memory usage under control. But... recently, something happened, and things have changed. After some random amount of time (1 to 40 minutes or so, under load), the parent httpd suddenly loses about 7-10MB of share between it and any new child it spawns. As you can imagine, the memory footprint of my httpds skyrockets and the delicate balance I set up is disturbed. Also, GTopLimit is no help in this case - it actually causes flailing because each new child starts with memory sharing that is out of bounds and is thus killed very quickly. Restarting Apache resets the memory usage and restores the server to smooth operation. Until it happens again. Using GTop() to get the shared memory of each child before and after running my perl for each page load showed that it wasn't my code causing the jump, but suddenly the child, after having a good amount of shared memory in use, loses a 10MB chunk and from then on the other children follow suit. So, something I did on the server (I'm always doing stuff!) has caused this change to happen, but I've been pulling my hair out for days trying to track it down. I am now getting desperate. One of the recent things I did was to run tux (another web server) to serve my images, but I don't see how that could have any effect on this. If anyone has any ideas what might cause the httpd parent (and new children) to lose a big chunk of shared memory between them, please let me know. I assume that you are on linux (tux :). look at the output of 'free' -- how much swap is used before you start the server and when the horrors begin? Your delicate balance could be ruined when the system starts to swap and the load doesn't go away. Therefore what you see is normal. Notice that it's possible that you didn't add a single line of code to your webserver, but updated some other app running on the same machine which started to use more memory, and that takes the balance off. Hope that my guess was right. If so, make sure that your system never swaps. Swap is for emergency short-term extra memory requirements, not for normal operation. _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
Re: loss of shared memory in parent httpd
On Tue, 12 Mar 2002 09:18:32 -0500 Bill Marrs [EMAIL PROTECTED] wrote: But... recently, something happened, and things have changed. After some random amount of time (1 to 40 minutes or so, under load), the parent httpd suddenly loses about 7-10mb of share between it and any new child it spawns. As you can imagine, the memory footprint of my httpds skyrockets and the delicate balance I set up is disturbed. Also, GTopLimit is no help Restarting Apache resets the memory usage and restores the server to smooth operation. Until, it happens again. Hi Bill I can't give you a decent answer, but I have noticed this as well, and my impression is that this happens when your httpd's are swapped out (when your system runs short of free memory) - or perhaps just when the parent httpd is swapped out (been a while - can't remember exactly what symptoms I observed). I think the whole phenomenon is a side-effect of normal memory management - I'm sure someone on the list will have a proper explanation. Bye Paolo
Re: loss of shared memory in parent httpd
Elizabeth Mattijsen wrote: At 09:18 AM 3/12/02 -0500, Bill Marrs wrote: If anyone has any ideas what might cause the httpd parent (and new children) to lose a big chunk of shared memory between them, please let me know. I've seen this happen many times. One day it works fine, the next you're in trouble. And in my experience, it's not a matter of why this avalanche effect happens, but more a matter of why it didn't happen before. You may not have realised that you were just below a threshold and now you're over it. And the change can be as small as the size of a heavily used template that suddenly gets over an internal memory allocation border, which in turn causes Perl to allocate more, which in turn causes memory to become unshared. I have been thinking about a perl/C routine that would internally use all of the memory that was already allocated by Perl. Such a routine would need to be called when the initial start of Apache is complete, so that any child that is spawned has a saturated memory pool, so that any new variables would need to use newly allocated memory, which would be unshared. But at least all of that memory would be used for new variables and not have the tendency to pollute old memory segments. I'm not sure whether my assessment of the problem is correct. I would welcome any comments on this. Nope Elizabeth, your explanation is not so correct. ;) Shared memory is not about sharing the pre-allocated memory pool (heap memory). Once you re-use a bit of preallocated memory the sharing goes away. Shared memory is about 'text'/read-only memory pages which never get modified, and pages that can get modified but as long as they aren't modified they are shared. Unfortunately (in this aspect, but fortunately for many other aspects) Perl is not a strongly-typed (or whatever you call it) language, therefore it's extremely hard to share memory, because in Perl almost everything is data. Though as you could see, Bill was able to share 43M out of 50M, which is damn good! _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
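[Editor's sketch] A small Linux-only demonstration of the page-level copy-on-write behaviour Stas describes; it assumes /proc/self/statm, whose third field counts shared resident pages, and the string size is illustrative:

    use strict;

    # (size, resident, shared) in pages -- Linux /proc format
    sub statm {
        open my $fh, '<', '/proc/self/statm' or die "statm: $!";
        return (split ' ', scalar <$fh>)[0, 1, 2];
    }

    my $big = 'x' x (10 * 1024 * 1024);    # ~10MB string built in the parent

    if (my $pid = fork) {
        waitpid $pid, 0;                   # parent just waits for the child
    } else {
        printf "before write: %d shared pages\n", (statm())[2];
        $big =~ tr/x/y/;                   # dirty every page of the string
        printf "after  write: %d shared pages\n", (statm())[2];
        exit 0;
    }

Right after the fork the child's copy of $big is shared copy-on-write; writing to it forces the kernel to copy those pages, and the shared count drops accordingly.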
Re: loss of shared memory in parent httpd
At 11:46 PM 3/12/02 +0800, Stas Bekman wrote: I'm not sure whether my assessment of the problem is correct. I would welcome any comments on this. Nope Elizabeth, your explanation is not so correct. ;) Too bad... ;-( Shared memory is not about sharing the pre-allocated memory pool (heap memory). Once you re-use a bit of preallocated memory the sharing goes away. I think the phrase is Copy-On-Write, right? And since RAM is allocated in chunks, let's assume 4K for the sake of the argument, changing a single byte in such a chunk causes the entire chunk to be unshared. In older Linux kernels, I believe I have seen that when a byte gets changed in a chunk of any child, that chunk becomes changed for _all_ children. Newer kernels only unshare it for that particular child. Again, if I'm not mistaken, and someone please correct me if I'm wrong... Since Perl is basically all data, you would need to find a way of localizing all memory that is changing to as few memory chunks as possible. My idea was just that: by filling up all used memory before spawning children, you would use up some memory, but that would be shared between all children and thus not so bad. But by doing this, you would hopefully cause all changing data to be localized to newly allocated memory by the children. Wish someone with more Perl guts experience could tell me if that really is an idea that could work or not... Shared memory is about 'text'/read-only memory pages which never get modified, and pages that can get modified but as long as they aren't modified they are shared. Unfortunately (in this aspect, but fortunately for many other aspects) Perl is not a strongly-typed (or whatever you call it) language, therefore it's extremely hard to share memory, because in Perl almost everything is data. Though as you could see, Bill was able to share 43M out of 50M, which is damn good! As a proof of concept I have run more than 100 200MB+ children on a 1 GB RAM machine and had sharing go up so high, causing the top number-of-bytes field for shared memory to cycle through its 32-bit range multiple times... ;-) . It was _real_ fast (had all of the data that it needed as Perl hashes and lists) and ran ok until something would start an avalanche effect and it would all go down in a whirlwind of swapping. So in the end, it didn't work reliably enough ;-( But man, was it fast when it ran... ;-) Elizabeth Mattijsen
Re: loss of shared memory in parent httpd
At 12:43 AM 3/13/02 +0800, Stas Bekman wrote: Doug has plans for a much improved opcode tree sharing for mod_perl 2.0, the details are kept as a secret so far :) Can't wait to see that! This topic is covered (will be) in the upcoming mod_perl book, where we include the following reference materials which you may find helpful for understanding the shared memory concepts. Ah... ok... can't wait for that either... ;-) Don't you love mod_perl for what it makes you learn :) Well, yes and no... ;-) Elizabeth Mattijsen
Re: loss of shared memory in parent httpd
Elizabeth Mattijsen wrote: Since Perl is basically all data, you would need to find a way of localizing all memory that is changing to as few memory chunks as possible. That certainly would help. However, I don't think you can do that in any easy way. Perl doesn't try to keep compiled code on separate pages from variable storage. - Perrin
Re: loss of shared memory in parent httpd
On Tuesday 12 March 2002 06:18, you wrote: I'm a heavy mod_perl user, running 3 sites as virtual servers, all with lots of custom Perl code. My httpd's are huge (~50MB), but with the help of a startup file I'm able to get them sharing most of their memory (~43MB). With the help of GTopLimit, I'm able to keep the memory usage under control. But... recently, something happened, and things have changed. After some random amount of time (1 to 40 minutes or so, under load), the parent httpd suddenly loses about 7-10MB of share between it and any new child it spawns. As you can imagine, the memory footprint of my httpds skyrockets and the delicate balance I set up is disturbed. Also, GTopLimit is no help in this case - it actually causes flailing because each new child starts with memory sharing that is out of bounds and is thus killed very quickly. We saw something similar here, running on Linux servers. Turned out to be that if the server swapped hard enough to swap an HTTPd out, then you basically lost all the shared memory that you had. I can't explain all of the technical details and the kernel-ness of it all, but from watching our own servers here this is what we saw on some machines that experienced quite a high load. Our quick solution was first to reduce the number of mod_perls that we had running, using the proxy-front-end/modperl-back-end technique, and then supplemented that by adding another Gig of RAM to the machine. And yes, once you've lost the shared memory, there isn't a way to get it back as shared again. And yes, I've also seen that when this happens it could well take the whole server right down the toilet with it (as then your ~800MB of shared memory becomes ~800MB of _physical_ memory needed, and that could throw the box into swap city). -- Graham TerMarsch Howling Frog Internet Development, Inc. http://www.howlingfrog.com
Re: loss of shared memory in parent httpd
Bill Marrs wrote: But... recently, something happened, and things have changed. After some random amount of time (1 to 40 minutes or so, under load), the parent httpd suddenly loses about 7-10MB of share between it and any new child it spawns. One possible reason is that a perl memory structure in there might be changing. Perl is able to grow variables dynamically by allocating memory in buckets, and it tends to be greedy when grabbing more. You might trigger another large allocation by something as simple as implicitly converting a string to a number, or adding one element to an array. Over time, I always see the parent process lose some shared memory. My advice is to base your tuning not on the way it looks right after you start it, but on the way it looks after serving pages for a few hours. Yes, you will underutilize the box just after a restart, but you will also avoid overloading it when things get going. I also recommend restarting your server every 24 hours, to reset things. One more piece of advice: I find it easier to tune memory control with a single parameter. Setting up a maximum size and a minimum shared size is not as effective as setting up a maximum *UNSHARED* size. After all, it's the amount of real memory being used by each child that you care about, right? Apache::SizeLimit has this now, and it would be easy to add to GTopLimit (it's just $SIZE - $SHARED). Doing it this way helps avoid unnecessary process turnover. - Perrin
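[Editor's sketch] A concrete illustration of Perrin's single-parameter suggestion, using the mod_perl 1.x Apache::SizeLimit interface; the variable name and KB units are as documented for that era's module, but verify against your installed version:

    # startup.pl -- kill a child once its *unshared* memory passes ~10MB,
    # instead of juggling a max total size plus a min shared size.
    use Apache::SizeLimit ();
    $Apache::SizeLimit::MAX_UNSHARED_SIZE = 10 * 1024;   # value in KB

    # httpd.conf -- have the check run at the end of each request:
    #   PerlFixupHandler Apache::SizeLimit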
Re: loss of shared memory in parent httpd
Thanks for all the great advice. A number of you indicated that it's likely due to my apache processes being partially swapped to disk. That seems likely to me. I haven't had a chance to prove that point, but when it does it again and I'm around, I plan to test it with free/top (top has a SWAP column which should show if my apaches are swapped out at all). I am in the process of getting a memory upgrade, so that should ease this problem. Meanwhile, I can set MaxClients lower and see if that keeps me out of trouble as well. I suspect adding the tux server disrupted the balance I had (apparently, I was tuned pretty close to my memory limits!) Yes, I am running on Linux... One more piece of advice: I find it easier to tune memory control with a single parameter. Setting up a maximum size and a minimum shared size is not as effective as setting up a maximum *UNSHARED* size. After all, it's the amount of real memory being used by each child that you care about, right? Apache::SizeLimit has this now, and it would be easy to add to GTopLimit (it's just $SIZE - $SHARED). Doing it this way helps avoid unnecessary process turnover. I agree. For me, with my ever more bloated Perl code, I find this unshared number to be easier to keep a lid on. I keep my apache children under 10MB each unshared, as you say. That number is more stable than the SIZE/SHARED numbers that GTopLimit offers. But, I have the GTopLimit sources, so I plan to tweak them to allow for an unshared setting. I think I bugged Stas about this a year ago and he had a reason why I was wrong to think this way, but I never understood it. -bill
Re: loss of shared memory in parent httpd
On Tue, 12 Mar 2002, Graham TerMarsch wrote: [...] We saw something similar here, running on Linux servers. Turned out to be that if the server swapped hard enough to swap an HTTPd out, then you basically lost all the shared memory that you had. I can't explain all of the technical details and the kernel-ness of it all, but from watching our own servers here this is what we saw on some machines that experienced quite a high load. Our quick solution was first to reduce the number of mod_perls that we had running, using the proxy-front-end/modperl-back-end technique, You should always do that. :-) and then supplemented that by adding another Gig of RAM to the machine. And yes, once you've lost the shared memory, there isn't a way to get it back as shared again. And yes, I've also seen that when this happens it could well take the whole server right down the toilet with it (as then your ~800MB of shared memory becomes ~800MB of _physical_ memory needed, and that could throw the box into swap city). I forwarded this mail to one of the CitySearch sysadmins who had told me about seeing this. He is seeing the same thing (using kernel 2.4.17), except that if he disables swap then the processes will get back to reporting more shared memory. So maybe it's really just GTop or the kernel reporting swapped stuff in an odd way. No, I can't explain the nitty gritty either. :-) Someone should write up a summary of this thread and ask in a technical linux place, or maybe ask Dean Gaudet. - ask -- ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
Re: loss of shared memory in parent httpd
No, I can't explain the nitty gritty either. :-) Someone should write up a summary of this thread and ask in a technical linux place, or maybe ask Dean Gaudet. I believe this is a linux/perl issue... standalone daemons exhibit the same behaviour... e.g. if you've got a parent Perl daemon that fork()s... swapping in data from a child does _not_ have any effect on other copies of that memory. I'm sure swapping in the memory of the parent before fork()ing would be fine. Admittedly, my experience is from old linux kernels (2.0), but I would not be surprised if current ones are similar. I'm sure it is the same on some other platforms, but I haven't used much else for a long time. -- [EMAIL PROTECTED] | Put all your eggs in one basket and http://BareMetal.com/ | WATCH THAT BASKET! web hosting since '95 | - Mark Twain
Re: loss of shared memory in parent httpd
Bill Marrs wrote: One more piece of advice: I find it easier to tune memory control with a single parameter. Setting up a maximum size and a minimum shared size is not as effective as setting up a maximum *UNSHARED* size. After all, it's the amount of real memory being used by each child that you care about, right? Apache::SizeLimit has this now, and it would be easy to add to GTopLimit (it's just $SIZE - $SHARED). Doing it this way helps avoid unnecessary process turnover. I agree. For me, with my ever more bloated Perl code, I find this unshared number to be easier to keep a lid on. I keep my apache children under 10MB each unshared, as you say. That number is more stable than the SIZE/SHARED numbers that GTopLimit offers. But, I have the GTopLimit sources, so I plan to tweak them to allow for an unshared setting. I think I bugged Stas about this a year ago and he had a reason why I was wrong to think this way, but I never understood it. I don't remember why I was arguing against it :) But in any case, I'll simply add this third option, so you can control it by either SIZE/SHARED or UNSHARED. _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
One of the shiny golden nuggets I received from said slice was a shared memory cache. It was simple, it was elegant, it was perfect. It was also based on IPC::Shareable. GREAT idea. BAD juju. Just use Cache::Cache. It's faster and easier. Now, ya see... Once upon a time, not many moons ago, the issue of Cache::Cache came up with the SharedMemory Cache and the fact that it has NO locking semantics. When I found this thread in searching for ways to implement my own locking scheme to make up for this lack, I came upon YOUR comments that perhaps Apache::Session::Lock::Semaphore could be used, without any of the rest of the Apache::Session package. That was a good enough lead for me. So I went into the manpage, and I went into the module, and then I misunderstood how the semaphore key was determined, and wasted a good hour or two trying to patch it. Then I reverted to my BASICS: Data::Dumper is your FRIEND. Print DEBUGGING messages. Duh, of course, except for some reason I didn't think to worry about it, at first, in somebody else's module. sigh So, I saw what I did wrong, undid the patches, and: A:S:L:S makes the ASSUMPTION that the argument passed to its locking methods is an Apache::Session object. Specifically, that it is a hashref of the following (at least partial) structure: { data => { _session_id => (something) } } The _session_id is used as the seed for the locking semaphore. *IF* I understood the requirements correctly, the _session_id has to be the same FOR EVERY PROCESS in order for the locking to work as desired, for a given shared data structure. So my new caching code is at the end of this message. ***OH WOW!*** So, DURING the course of composing this message, I've realized that the function expire_old_accounts() is now redundant! Cache::Cache takes care of that, both with expires_in and max_size. I'm leaving it in for reference, just to show how it's improved. :-) ***OH WOW! v1.1*** :-) I've also just now realized that the call to bind_accounts() could actually go right inside lookup_account(), if: 1) lookup_account() is the only function using the cache, or 2) lookup_account() is ALWAYS THE FIRST function to access the cache, or 3) every OTHER function accessing the cache has the same call, of the form bind() unless defined $to_bind; I think for prudence I'll leave it outside for now. L8r, Rob %= snip =%

    use Apache::Session::Lock::Semaphore ();
    use Cache::SizeAwareSharedMemoryCache ();

    # this is used in %cache_options, as well as for locking
    use constant SIGNATURE    => 'EXIT';
    use constant MAX_ACCOUNTS => 300;

    # use vars qw/%ACCOUNTS/;
    use vars qw/$ACCOUNTS $locker/;

    my %cache_options = ( namespace          => SIGNATURE,
                          default_expires_in =>   # (no expiry value appears here in the original post)
                          max_size           => MAX_ACCOUNTS );

    sub handler {
        # ... init code here. parse $account from the request, and then:
        bind_accounts() unless defined($ACCOUNTS);

        # verify (access the cache)
        my $accountinfo = lookup_account($account)
            or $r->log_reason("no such account: $account"), return HTTP_NO_CONTENT;

        # ... content here
    }

    # Bind the account variables to shared memory
    sub bind_accounts {
        warn "bind_accounts: Binding shared memory" if $debug;
        $ACCOUNTS = Cache::SizeAwareSharedMemoryCache->new( \%cache_options )
            or croak( "Couldn't instantiate SizeAwareSharedMemoryCache : $!" );
        # Shut up Apache::Session::Lock::Semaphore
        $ACCOUNTS->{data}->{_session_id} = join '', SIGNATURE, @INC;
        $locker = Apache::Session::Lock::Semaphore->new();
        # not quite ready to trust this yet. :-)
        # We'll keep it separate for now.
        # #$ACCOUNTS->set('locker', $locker);
        warn "bind_accounts: done" if $debug;
    }

    ### DEPRECATED! Cache::Cache does this FOR us!
    # bring the current session to the front and
    # get rid of any that haven't been used recently
    sub expire_old_accounts {
        ### DEPRECATED!
        return;
        my $id = shift;
        warn "expire_old_accounts: entered\n" if $debug;
        $locker->acquire_write_lock($ACCOUNTS);
        #tied(%ACCOUNTS)->shlock;
        my @accounts = grep( $id ne $_, @{$ACCOUNTS->get('QUEUE') || []} );
        unshift @accounts, $id;
        if (@accounts > MAX_ACCOUNTS) {
            my $to_delete = pop @accounts;
            $ACCOUNTS->remove($to_delete);
        }
        $ACCOUNTS->set('QUEUE', \@accounts);
        $locker->release_write_lock($ACCOUNTS);
        #tied(%ACCOUNTS)->shunlock;
        warn "expire_old_accounts: done\n" if $debug;
    }

    sub lookup_account {
        my $id = shift;
        warn "lookup_account: begin" if $debug;
        expire_old_accounts($id);
        warn "lookup_account: Accessing \$ACCOUNTS{$id}" if $debug;
        my $s = $ACCOUNTS->get($id);
        if (defined $s) {
            # SUCCESSFUL CACHE HIT
            warn "lookup_account: Retrieved accountinfo from Cache (bypassing SQL)" if $debug;
            return $s;
        }
        ## NOT IN CACHE... refreshing.
        warn "lookup_account: preparing SQL" if $debug;
        # ... do some SQL here. Assign results
RE: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
The _session_id is used as the seed for the locking semaphore. *IF* I understood the requirements correctly, the _session_id has to be the same FOR EVERY PROCESS in order for the locking to work as desired, for a given shared data structure. Only if you want to lock the whole thing, rather than a single record. Cache::Cache typically updates just one record at a time, not the whole data structure, so you should only need to lock that one record. Uhh... good point, except that I don't trust the Cache code. The AUTHOR isn't ready to put his stamp of approval on the locking/updating. I'm running 10 hits/sec on this server, and 'last write wins', which ELIMINATES other writes, is not acceptable. I had a quick look at your code and it seems redundant with Cache::Cache. You're using the locking just to ensure safe updates, which is already done for you. Well, for a single, atomic lock, maybe. My two points above are the why of my hesitancy. Additionally, what if I decide to add to my handler? What if I update more than one thing at once? Now I've got the skeleton based on something that somebody trusts (A:S:L:S), vs what somebody thinks is alpha/beta (C:SASMC). In other words, TIMTOWTDI! :-) L8r, Rob
RE: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
Uhh... good point, except that I don't trust the Cache code. The AUTHOR isn't ready to put his stamp of approval on the locking/updating. That sort of hesitancy is typical of CPAN. I wouldn't worry about it. I think I remember Randal saying he helped a bit with that part. In my opinion, there is no good reason to think that the Apache::Session locking code is in better shape than the Cache::Cache locking, unless you've personally reviewed the code in both modules. Well, the fact is, I respect your opinion. And YES, it seems like I'm doing more work than is probably necessary. I've been screwed over SO MANY TIMES by MYSELF not thinking of some little detail, that I've developed a tendency to design in redundant design redundancy :-) so that if one thing fails, the other will catch it. This reduces downtime... I'm running 10 hits/sec on this server, and 'last write wins', which ELIMINATES other writes, is not acceptable. As far as I can see, that's all that your code is doing. You're simply locking when you write, in order to prevent corruption. You aren't acquiring an exclusive lock when you read, so anyone could come in between your read and write and make an update which would get overwritten when you write, i.e. last write wins. Again, good point... I'm coding as if the WHOLE cache structure will break if any little thing gets out of line. I was trying to think in terms of data safety like one would with threading, because A) I was worried about whether shared memory was as sensitive to locks/corruption as threading, and B) I reviewed Apache::Session's lock code, but didn't review Cache::Cache's (20/20 hindsight, ya know). You're more than welcome to roll your own solution based on your personal preferences, but I don't want people to get the wrong idea about Cache::Cache. It handles the basic locking needed for safe updates. Then my code just got waaay simpler, both in terms of data flow and individual coding sections. THANK YOU! :-) L8r, Rob #!/usr/bin/perl -w use Disclaimer qw/:standard/;
Re: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
On Tue, Sep 04, 2001 at 12:14:52PM -0700, Rob Bloodgood wrote: ***OH WOW!*** So, DURING the course of composing this message, I've realized that the function expire_old_accounts() is now redundant! Cache::Cache takes care of that, both with expires_in and max_size. I'm leaving it in for reference, just to show how it's improved. :-) [snip] use Apache::Session::Lock::Semaphore (); use Cache::SizeAwareSharedMemoryCache (); # this is used in %cache_options, as well as for locking use constant SIGNATURE => 'EXIT'; use constant MAX_ACCOUNTS => 300; # use vars qw/%ACCOUNTS/; use vars qw/$ACCOUNTS $locker/; my %cache_options = ( namespace => SIGNATURE, default_expires_in => ..., max_size => MAX_ACCOUNTS ); Very neat thought about how to use max_size to limit the accounts! Unfortunately, you demonstrated that I did a *terrible* job at documenting what size means. It means size in bytes, not items. I will add max_items and limit_items to the TODO list. In the meantime, I will improve the documentation. -DeWitt
RE: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
What about my IPC::FsSharevars? I've once mentioned it on this list, but I don't have the time to read all list mail, so maybe I've missed some conclusions following the discussion from last time. I remember the post and went to find IPC::FsSharevars a while ago and was un-intrigued when I didn't find it on CPAN. has there been any feedback from the normal perl module forums? --Geoff
RE: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
At 20:37 Uhr -0400 4.9.2001, Geoffrey Young wrote: I remember the post and went to find IPC::FsSharevars a while ago and was un-intrigued when I didn't find it on CPAN. has there been any feedback from the normal perl module forums? I haven't announced it on other forums (yet). (I think it's still more of a working version that needs feedback and some work to make it generally usable (i.e. under mod_perl). Which forum should I post on?) christian
Re: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
Christian == Christian Jaeger [EMAIL PROTECTED] writes: Christian I haven't announced it on other forums (yet). (I think it's Christian more of a working version yet that needs feedback and some Christian work to make it generally useable (i.e. under Christian mod_perl). Which forum should I post on?) If you put it on the CPAN with a version number below 1, that's usually a clue that it's still alpha or beta. Then you can announce it through the normal module announcement structures. If you hide it, I'm sure not installing it. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/ Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
Perrin == Perrin Harkins [EMAIL PROTECTED] writes: Uhh... good point, except that I don't trust the Cache code. The AUTHOR isn't ready to put his stamp of approval on the locking/updating. Perrin That sort of hesitancy is typical of CPAN. I wouldn't worry Perrin about it. I think I remember Randal saying he helped a bit Perrin with that part. I helped with the code that ensures that *file* writes are atomic updates. I taught DeWitt the trick of writing to a temp file, then renaming when ready, so that any readers see only the old file or the new file, but never a partially written file. I don't think Cache::Cache has enough logic for an atomic read-modify-write in any of its modes to implement (for example) a web hit counter. It has only atomic write. The last write wins strategy is fine for caching, but not for transacting, so I can see why Rob is a bit puzzled. It'd be nice if we could build a generic atomic read-modify-write, but now we're back to Apache::Session, which in spite of its name works fine away from Apache. :) Caching. An area of interest of mine, but I still don't seem to get around to really writing the framework I want, so all I can do is keep lobbing grenades into the parts I don't want. :) :) Sorry guys. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/ Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
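The temp-file-and-rename trick Randal describes, in outline (a sketch; the path and the pre-serialized value are assumptions, not Cache::Cache's actual internals):

    # rename() is atomic on a local filesystem, so a reader sees either
    # the complete old file or the complete new file, never a partial one.
    my $file = "/tmp/FileCache/$key";   # hypothetical cache entry path
    my $tmp  = "$file.tmp.$$";          # per-process temp name

    open(TMP, ">$tmp")  or die "open $tmp: $!";
    print TMP $frozen_value;            # value serialized beforehand
    close(TMP)          or die "close $tmp: $!";
    rename($tmp, $file) or die "rename $tmp -> $file: $!";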
Re: Shared memory caching revisited (was it's supposed to SHARE it, not make more!)
I don't think Cache::Cache has enough logic for an atomic read-modify-write in any of its modes to implement (for example) a web hit counter. It has only atomic write. The last write wins strategy is fine for caching, but not for transacting, so I can see why Rob is a bit puzzled. In his example code he was only doing atomic writes as well, so it should work at least as well for his app as what he had before. It'd be nice if we could build a generic atomic read-modify-write Maybe a get_for_update() method is what's needed. It would block any other process from doing a set() or a get_for_update() until the set() for that key has completed. It's still just advisory locking though, so if you forget and use a regular get() for some data you later plan to set(), you will not be getting atomic read-modify-write. Maybe get() could be re-named to get_read_only(), or set a flag that prevents saving the fetched data. Most caching apps are happy with last save wins though, so I guess anything like that would need to be optional. - Perrin
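From the caller's side, the proposed get_for_update() might look like this (hypothetical API -- it does not exist in Cache::Cache):

    # Hypothetical API sketch.  get_for_update() would take an
    # exclusive lock on the key; set() on that key would complete the
    # update and release the lock.  Still only advisory: a plain get()
    # elsewhere bypasses the protection entirely.
    my $count = $cache->get_for_update('hit_counter') || 0;
    $cache->set('hit_counter', $count + 1);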
Re: Increasing Shared Memory
Quoting Joshua Chamas [EMAIL PROTECTED]: Also, more a side note, I have found that you have to fully restart apache, not just a graceful, if either the Oracle server is restarted or the TNS listener is restarted. We fixed this at eToys by having children that failed to connect to the database exit after the current request. Newly spawned children seem to connect okay. - Perrin
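A sketch of that eToys-style fix under mod_perl 1.x (handler framing, DSN and credentials are made up):

    use DBI ();
    use Apache::Constants qw(SERVER_ERROR);

    sub handler {
        my $r = shift;
        my $dbh = eval { DBI->connect($dsn, $user, $pass) };
        unless ($dbh) {
            $r->log_error("db connect failed: $@");
            $r->child_terminate;   # mod_perl 1.x: this child exits
                                   # after the current request
            return SERVER_ERROR;
        }
        # ... normal request handling; a freshly spawned child will
        # connect cleanly once the database is back ...
    }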
Re: Increasing Shared Memory
Quoting Bob Foster [EMAIL PROTECTED]: Immediately after I make an Oracle database connection, the child jumps from a size of 3.6M (2.4M shared) to 17.4M (3.4M shared). The child process slowly grows to 22.2M (3.4M shared). The loaded libs Sizes total 13.6M. Shouldn't the libs load into shared memory? If so, how? DBI->install_driver() See http://perl.apache.org/guide/performance.html#Initializing_DBI_pm for more. - Perrin
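In startup.pl that is just (driver name per this thread):

    use DBI ();
    DBI->install_driver('Oracle');   # loads DBD::Oracle into the parent,
                                     # so its code is shared with children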
Increasing Shared Memory
Hi, I'm using Stas Bekman's excellent Apache::VMonitor module to help me decrease my mod_perl child process memory usage. I was working on preloading all of my perl modules and scripts in a startup.pl script when I noticed that the amount of shared memory seemed very low. Immediately after I make an Oracle database connection, the child jumps from a size of 3.6M (2.4M shared) to 17.4M (3.4M shared). The child process slowly grows to 22.2M (3.4M shared). The loaded libs Sizes total 13.6M. Shouldn't the libs load into shared memory? If so, how? Here's a snippet from Apache::VMonitor.

Loaded libs Sizes: (in bytes)
5800677 ( 5.5M) : /ora1/oracle/product/8.1.5/lib/libclntsh.so.8.0
4118299 ( 3.9M) : /lib/libc-2.1.2.so
1134094 ( 1.1M) : /usr/local/apache_perl/bin/httpd_perl
 540120 ( 527k) : /lib/libm-2.1.2.so
 372604 ( 364k) : /lib/libnsl-2.1.2.so
 344890 ( 337k) : /lib/ld-2.1.2.so
 254027 ( 248k) : /lib/libnss_nis-2.1.2.so
 253826 ( 248k) : /lib/libnss_nisplus-2.1.2.so
 247381 ( 242k) : /lib/libpthread-0.8.so
 247348 ( 242k) : /lib/libnss_files-2.1.2.so
 162740 ( 159k) : /usr/lib/libglib-1.2.so.0.0.5
  81996 (  80k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/DBD/Oracle/Oracle.so
  74663 (  73k) : /lib/libdl-2.1.2.so
  64595 (  63k) : /lib/libcrypt-2.1.2.so
  63848 (  62k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/DBI/DBI.so
  60026 (  59k) : /usr/lib/libgtop_sysdeps.so.1.0.2
  58088 (  57k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/Date/Calc/Calc.so
  51146 (  50k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/Storable/Storable.so
  48500 (  47k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/GTop/GTop.so
  46795 (  46k) : /usr/local/apache_perl/libexec/mod_include.so
  32545 (  32k) : /usr/lib/gconv/ISO8859-1.so
  31927 (  31k) : /usr/lib/libgtop.so.1.0.2
  29970 (  29k) : /usr/share/locale/en_US/LC_COLLATE
  28657 (  28k) : /usr/lib/libgdbm.so.2.0.0
  27405 (  27k) : /usr/local/apache_perl/libexec/mod_status.so
  24409 (  24k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/Apache/Scoreboard/Scoreboard.so
  22970 (  22k) : /usr/lib/perl5/5.00503/i386-linux/auto/Data/Dumper/Dumper.so
  17732 (  17k) : /usr/lib/libgtop_common.so.1.0.2
  14748 (  14k) : /usr/lib/perl5/5.00503/i386-linux/auto/IO/IO.so
  11137 (  11k) : /usr/lib/perl5/site_perl/5.005/i386-linux/auto/Time/HiRes/HiRes.so
  10764 (  11k) : /usr/local/apache_perl/libexec/mod_usertrack.so
  10428 (  10k) : /usr/share/locale/en_US/LC_CTYPE
   9827 (  10k) : /usr/lib/perl5/5.00503/i386-linux/auto/Fcntl/Fcntl.so
   4847 (   5k) : /ora1/oracle/product/8.1.5/lib/libskgxp8.so
    508 (   1k) : /usr/share/locale/en_US/LC_TIME
     93 (   1k) : /usr/share/locale/en_US/LC_MONETARY
     44 (   1k) : /usr/share/locale/en_US/LC_MESSAGES/SYS_LC_MESSAGES
     27 (   1k) : /usr/share/locale/en_US/LC_NUMERIC

Thanks for your help! Bob Foster
Re: Increasing Shared Memory
Bob Foster wrote: Hi, I'm using Stas Bekman's excellent Apache::VMonitor module to help me decrease my mod_perl child process memory usage. I was working on preloading all of my perl modules and scripts in a startup.pl script when I noticed that the amount of shared memory seemed very low. Immediately after I make an Oracle database connection, the child jumps from a size of 3.6M (2.4M shared) to 17.4M (3.4M shared). The child process slowly grows to 22.2M (3.4M shared). The loaded libs Sizes total 13.6M. Shouldn't the libs load into shared memory? If so, how? Make sure to use DBD::Oracle in your startup.pl or do PerlModule DBD::Oracle ... that should load up some Oracle libs in the parent. Also, you *might* try doing a connect or even an invalid connect to Oracle, which might grab some extra libs that it only loads at runtime. If you do a connect in the parent httpd to Oracle, make sure that Apache::DBI doesn't cache it later for the children, which is why I was suggesting doing an invalid connect with user/pass like BADBAD/BADBAD --Josh _ Joshua Chamas Chamas Enterprises Inc. NodeWorks Founder Huntington Beach, CA USA http://www.nodeworks.com 1-714-625-4051
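As a startup.pl sketch (the deliberately bad BADBAD credentials are from Joshua's suggestion; the DSN is an assumption):

    # startup.pl -- runs once in the parent, before forking
    use DBI ();
    use DBD::Oracle ();   # pulls the Oracle client libs into the parent

    # Force any lazily-loaded Oracle libs in with a connect that is
    # guaranteed to fail, so Apache::DBI can't cache the handle for
    # the children.
    eval {
        DBI->connect('dbi:Oracle:SID', 'BADBAD', 'BADBAD',
                     { PrintError => 0, RaiseError => 1 });
    };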
Re(2): Increasing Shared Memory
[EMAIL PROTECTED] writes: Make sure to use DBD::Oracle in your startup.pl or do PerlModule DBD::Oracle ... that should load up some Oracle libs in the parent. Also, you *might* try doing a connect or even an invalid connect to Oracle, which might grab some extra libs that it only loads at runtime. If you do a connect in the parent httpd to Oracle, make sure that Apache::DBI doesn't cache it later for the children, which is why I was suggesting doing an invalid connect with user/pass like BADBAD/BADBAD Thank you very much, Joshua. I have made some progress and am now seeing 15.8M shared out of 16.7M on the parent. I believe that the problem was that I was doing a graceful restart which wasn't restarting the parent process. Now I have a different problem. When I initially start apache, everything is OK, but when the child processes die (due to MaxRequestsPerChild or me killing them) , the new child doesn't have a database connection and I'm getting Internal Server Error. I suspect I can sort that mess out after a bit more troubleshooting. Thanks again! Bob Foster
Re: Increasing Shared Memory
Bob Foster wrote: Thank you very much, Joshua. I have made some progress and am now seeing 15.8M shared out of 16.7M on the parent. I believe that the problem was that I was doing a graceful restart which wasn't restarting the parent process. Now I have a different problem. When I initially start apache, everything is OK, but when the child processes die (due to MaxRequestsPerChild or me killing them), the new child doesn't have a database connection and I'm getting Internal Server Error. I suspect I can sort that mess out after a bit more troubleshooting. Cool, but each child will need its own database connection, and last time I worked with Oracle like this, it was a minimum 3M unshared just for the Oracle connection. So just make sure that each child httpd is doing its own DBI->connect() and not reusing a handle created in the parent httpd or you will have problems. Also, more a side note, I have found that you have to fully restart apache, not just a graceful, if either the Oracle server is restarted or the TNS listener is restarted. --Josh _ Joshua Chamas Chamas Enterprises Inc. NodeWorks Founder Huntington Beach, CA USA http://www.nodeworks.com 1-714-625-4051
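If you'd rather have each child connect automatically at startup instead of on its first request, Apache::DBI can register that for you (a sketch; DSN and credentials assumed):

    # In startup.pl: register a connection to be opened in each
    # child's PerlChildInitHandler, so no handle is ever shared
    # with the parent.
    use Apache::DBI ();
    Apache::DBI->connect_on_init('dbi:Oracle:SID', $user, $pass,
                                 { AutoCommit => 1, RaiseError => 1 });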
Shared memory between child processes
I understand the forking model of Apache, and what that means in terms of data initialized in the start-up phase being ready-to-go in each child process. But what I need to do is manage it so that a particular value is shared between all children, such that changes made by one are recognized by the others. In a simple case, imagine wanting to count how many times total the given handler is invoked. Bumping a global counter will still be local to the given child process, and if part of the handler's interface is to report this counter value, then the reported number is going to be dependent upon which child answers the request. I'm needing to implement a handler that uses a true Singleton pattern for the class instance. One per server, not just one per process (or thread). Randy -- --- Randy J. Ray | Buy a copy of a baby naming book and you'll never be at a [EMAIL PROTECTED] | loss for variable names. Fred is a wonderful name, and easy +1 408 543-9482 | to type. --Roedy Green, "How To Write Unmaintainable Code"
Re: Shared memory between child processes
At 5:30 PM -0800 3/30/01, Randy J. Ray wrote: I understand the forking model of Apache, and what that means in terms of data initialized in the start-up phase being ready-to-go in each child process. But what I need to do is manage it so that a particular value is shared between all children, such that changes made by one are recognized by the others. In a simple case, imagine wanting to count how many times total the given handler is invoked. Bumping a global counter will still be local to the given child process, and if part of the handler's interface is to report this counter value, then the reported number is going to be dependent upon which child answers the request. I'm needing to implement a handler that uses a true Singleton pattern for the class instance. One per server, not just one per process (or thread). You'll need to use some form of persistence mechanism such as a database, file, or perhaps (assuming you're on a Unix system) something like System V shared memory or semaphores. One quick 'n cheap way to implement mutual exclusion between Unix processes (executing on the same processor) is to use mkdir, which is atomic (ie once a process requests a mkdir, the mkdir will either be done or rejected before the requesting process is preempted by any other process). So you can do:

  mkdir "xyz"
  if "xyz" already exists, wait or return an error
  read or write shared variable on disc
  rmdir "xyz"

to guarantee that only one process at a time can be trying to access a disc file. There are many possible variations on this theme.
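As a quick Perl sketch of that recipe (paths made up; the counter file is the "shared variable on disc"):

    # Crude mkdir-based mutex: mkdir fails atomically if the
    # directory already exists, so only one process gets the lock.
    my $lockdir = "/tmp/counter.lock";
    until (mkdir($lockdir, 0755)) {
        select(undef, undef, undef, 0.1);   # wait 100ms, then retry
    }

    # --- critical section: read, bump and rewrite the counter ---
    my $n = 0;
    if (open(CNT, "</tmp/counter")) { $n = <CNT> || 0; close CNT; }
    open(CNT, ">/tmp/counter") or die "open: $!";
    print CNT $n + 1, "\n";
    close CNT;
    # --- end critical section ---

    rmdir($lockdir) or warn "rmdir: $!";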
Re: Shared memory between child processes
Hello, RR I'm needing to implement a handler that uses a true Singleton pattern for RR the class instance. One per server, not just one per process (or thread). SL You'll need to use some form of persistence mechanism such as a SL database, file, or perhaps (assuming you're on a Unix system) SL something like System V shared memory or semaphores. You can find more information on maintaining server-side state in the mod_perl guide or from the mod_perl book (at perl.apache.org and www.modperl.com, respectively). SL One quick 'n cheap way to implement mutual exclusion between Unix SL processes (executing on the same processor) is to use mkdir, which is SL atomic (ie once a process requests a mkdir, the mkdir will either be SL done or rejected before the requesting process is preempted by any SL other process). IMO, this is sort of cumbersome; on a single processor, you can just use advisory file locking. It's when you get onto NFS-mounted systems with high concurrency that you have to muck with rolling your own mutexes (I find I usually use atomic move for that purpose). But on a single system, use flock() and a known lockfile, or sysopen with O_CREAT|O_EXCL if you can't put the file there beforehand. Humbly, Andrew -- Andrew Ho http://www.tellme.com/ [EMAIL PROTECTED] Engineer [EMAIL PROTECTED] Voice 650-930-9062 Tellme Networks, Inc. 1-800-555-TELL Fax 650-930-9101 --
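Both suggestions as a Perl sketch (lock-file paths are made up):

    use Fcntl qw(:flock O_CREAT O_EXCL O_WRONLY);

    # flock() with a well-known lock file (single machine; flock is
    # not reliable over NFS).  The lock is advisory: every process
    # that touches the shared resource must take it.
    open(LOCK, ">/tmp/myapp.lock") or die "open: $!";
    flock(LOCK, LOCK_EX)           or die "flock: $!";
    # ... critical section: read/write the shared resource ...
    flock(LOCK, LOCK_UN);
    close(LOCK);

    # Alternative when the lock file must be created on the fly:
    # O_CREAT|O_EXCL makes create-if-not-exists atomic.
    sysopen(SEM, "/tmp/myapp.sem", O_WRONLY | O_CREAT | O_EXCL, 0644)
        or die "lock held elsewhere: $!";
    # ... critical section ...
    close(SEM);
    unlink("/tmp/myapp.sem");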
Connection from mod_perl to Informix server via shared memory IPC
Hello everybody, We, I mean my company, recently put our new Solaris 2.7 server into production. Software installed on this machine includes: Informix 9.21, Apache 1.3.19, Perl 5.6.0, mod_perl 1.25, DBI 1.14, DBD-Informix 1.00 PC1, Mason 1.0. The connection type for Informix was just ontlitcp. After some time, we experienced some errors reported from Informix, both in console mode and from Apache. Here are the error descriptions: -25574 Network driver cannot open the network device. A system call has failed. See the UNIX system administrator for assistance. -12 Not enough core. An operating-system error code with the meaning shown was unexpectedly returned to the database server. Core probably refers to data space in memory that an operating-system function needed. Look for other operating-system error messages that might give more information. At the same time ssh and some other TCP services worked quite well, no problem. Because this server is in production, after an unsuccessful attempt to restart Informix (could not initialize shared memory), I just restarted the whole server. Then I thought let's try some other connection type for Informix, and installed one more alias server using onipcshm (IPC over shared memory). This works fine for all console Perl programs connecting through the same DBD::Informix module, but it doesn't work for mod_perl invoked scripts, and of course not for Mason either. So, finally we used the shared memory connection for the Apache WebDatablade server and console programs triggered by cron, but the TLI over TCP/IP connection for a separate Apache with mod_perl running on a different port. The first Apache, running on port 80, proxies requests back and forth to the mod_perl enabled Apache. When the problem with no core on /dev/tcp appeared again (after a few days), this mod_perl Apache server stopped connecting to the database, reporting 500 messages. Here are the errors reported in the Apache error log: -25588 The appl process cannot connect to Dynamic Server 2000 server-name. An error in the application can produce this message. Check your CONNECT statement and the sqlhosts file. A system failure can also produce this message. If you can find no immediate cause for the failure, note the circumstances and contact your database server administrator for assistance. The shared memory communication subsystem is down or not functioning. Contact the database server administrator to report the problem. -25588 The appl process cannot connect to the database server server-name. An error in the application can produce this message. Check your CONNECT statement and the sqlhosts file. A system failure can also produce this message. If you can find no immediate cause for the failure, note the circumstances and contact your database server administrator for assistance. The shared memory communication subsystem is down or not functioning. Contact the database server administrator to report the problem. and ISAM error 4, which means: -4 Interrupted system call. An operating-system error code with the meaning shown was unexpectedly returned to the database server. You might have pressed the interrupt key at a crucial moment, or the software might have generated an interrupt signal such as the UNIX command kill. Look for other operating-system error messages that might give more information. If the error recurs, please note all circumstances and contact Informix Technical Support. CGI scripts that are invoked as conventional CGI with the same Apache work OK using shared memory or whatever.
I am trying to resolve the first problem with the /dev/tcp device, and I tweaked some parameters with ndd (after the problem started), but I am not sure yet if I fixed anything. Is it a good idea at all to use a connection through shared memory (at least I have been told that it is not secure, but that it is the fastest way)? Thank you in advance. Best regards, Boban Acimovic UNIX SysAdmin
Re: mod_perl shared memory with MM
Sorry for taking a while to get back to this, road trips can be good at interrupting the flow of life. It depends on the application. I typically use a few instances of open() for the sake of simplicity, but I have also had decent luck with IPC::Open(2|3). The only problems I've had with either were an OS-specific bug with Linux (the pipe was newline buffering and dropping all characters over 1023; moved to FreeBSD and the problem went away). Words of wisdom: start slow because debugging over a pipe can be a headache (understatement). Simple additions + simple debugging = good thing(tm). I've spent too many afternoons/nights ripping apart these kinds of programs only to find a small type-o and then reconstructing a much larger query/response set of programs. -sc PS You also want to attach the program listening to the named pipe to something like DJB's daemontools (http://cr.yp.to/daemontools.html) to prevent new requests from blocking if the listener dies: bad thing(tm). On Wed, Feb 28, 2001 at 10:23:06PM -0500, Adi Fairbank wrote: Sean, Yeah, I was thinking about something like that at first, but I've never played with named pipes, and it didn't sound too safe after reading the perlipc man page. What do you use, Perl open() calls, IPC::Open2/3, IPC::ChildSafe, or something else? How stable has it been for you? I just didn't like all those warnings in the IPC::Open2 and perlipc man pages. -Adi Sean Chittenden wrote: The night of Fat Tuesday no less... that didn't help any either. ::sigh:: Here's one possibility that I've done in the past because I needed mod_perl sessions to be able to talk with non-mod_perl programs. I setup a named bi-directional pipe that let you write a query to it for session information, and it wrote back with whatever you were looking for. Given that this needed to support perl, java, and c, it worked _very_ well and was extremely fast. Something you may also want to consider because it keeps your session information outside of apache (in case of restart of apache, or desire to synchronize session information across multiple hosts). -sc -- Sean Chittenden [EMAIL PROTECTED]
Re: mod_perl shared memory with MM
At 22:23 Uhr -0500 10.3.2001, DeWitt Clinton wrote: On Sat, Mar 10, 2001 at 04:35:02PM -0800, Perrin Harkins wrote: Christian Jaeger wrote: Yes, it uses a separate file for each variable. This way locking is also solved: each variable has its own file lock. You should take a look at DeWitt Clinton's Cache::FileCache module, announced on this list. It might make sense to merge your work into that module, which is the next generation of the popular File::Cache module. Yes! I'm actively looking for additional developers for the Perl Cache project. I'd love new implementations of the Cache interface. Cache::BerkeleyDBCache would be wonderful. Check out: http://sourceforge.net/projects/perl-cache/ For what it is worth, I don't explicitly lock. I do atomic writes instead, and have yet to hear anyone report a problem in the year the code has been public. I've looked at Cache::FileCache now and think it's (currently) not possible to use for IPC::FsSharevars: I really miss locking capabilities. Imagine a script that reads a value at the beginning of a request and writes it back at the end of the request. If it's not locked during this time, another instance can read the same value and then write another change back, which is then overwritten by the first instance. IPC::FsSharevars even goes one step further: instead of locking everything for a particular session, it only locks individual variables. So you can say "I use the variables $foo and %bar from session 12345 and will write %bar back", in which case %bar of session 12345 is locked until it is written back, while $foo and @baz are still unlocked and may be read (and written) by other instances. :-) Such behaviour is useful if you have framesets where a browser may request several frames of the same session in parallel (you can see an example on http://testwww.ethz.ch, click on 'Suche' then on the submit button, the two appearing frames are executed in parallel and both access different session variables), or for handling session-independent (global) data. One thing to be careful about in such situations is deadlocking. IPC::FsSharevars prevents deadlocks by getting all needed locks at the same time (this is done by first requesting a general session lock and then trying to lock all needed variable container files - if it fails, the session lock is freed again and the process waits for a unix signal indicating a change in the locking states). Getting all locks at the same time is more efficient than getting locks always in the same order. BTW some questions/suggestions for DeWitt: - why don't you use 'real' constants for $SUCCESS and the like? (use constant) - you probably should either append the userid of the process to /tmp/FileCache or make this folder globally writeable (and set the sticky flag). Otherwise other users get a permission error. - why don't you use Storable.pm? It should be much faster than Data::Dumper I have some preliminary benchmark code -- only good for relative benchmarking, but it is a start. I'd be happy to post the results here if people are interested. Could you send me the code? Then I'll look into benchmarking my module too.
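The back-out-and-retry idea can be sketched with plain lock files (an illustration, not IPC::FsSharevars' actual code; paths are made up, and the polling loop stands in for the signal-based wakeup the post describes):

    use Fcntl qw(:flock);

    # Try to take every per-variable lock; if any one fails, give
    # them all back, so no process ever holds a partial set (which
    # is what makes deadlock possible).
    sub try_lock_all {
        my @files = @_;
        my @held;
        for my $f (@files) {
            my $fh;
            unless (open($fh, ">$f") and flock($fh, LOCK_EX | LOCK_NB)) {
                return;    # @held goes out of scope here; its handles
            }              # are closed, which releases those locks
            push @held, $fh;
        }
        return @held;      # caller holds these handles until done
    }

    my @locks;
    until (@locks = try_lock_all("/tmp/s12345.foo.lck", "/tmp/s12345.bar.lck")) {
        select(undef, undef, undef, 0.05);   # back off, then retry
    }
    # ... read/modify/write the variables ...
    close $_ for @locks;                     # release everything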
[OT] Re: mod_perl shared memory with MM
On Sun, Mar 11, 2001 at 03:33:12PM +0100, Christian Jaeger wrote: I've looked at Cache::FileCache now and think it's (currently) not possible to use for IPC::FsSharevars: I really miss locking capabilities. Imagine a script that reads a value at the beginning of a request and writes it back at the end of the request. If it's not locked during this time, another instance can read the same value and then write another change back, which is then overwritten by the first instance. I'm very intrigued by your thinking on locking. I had never considered the transaction based approach to caching you are referring to. I'll take this up privately with you, because we've strayed far off the mod_perl topic, although I find it fascinating. - why don't you use 'real' constants for $SUCCESS and the like? (use constant) Two reasons, mostly historical, and not necessarily good ones. One, I benchmarked some code once that required high performance, and the use of constants was just slightly slower. Two, I like the syntax $hash{$CONSTANT}. If I remember correctly, $hash{CONSTANT} didn't work. This may have changed in newer versions of Perl. Obviously those are *very* small issues, and so it is mostly by habit that I don't use constant. I would consider changing, but it would mean asking everyone using the code to change too, because they currently import and use the constants as Exported scalars. Do you know of a very important reason to break compatibility and force the switch? I'm not opposed to switching if I have to, but I'd like to minimize the impact on the users. - you probably should either append the userid of the process to /tmp/FileCache or make this folder globally writeable (and set the sticky flag). Otherwise other users get a permission error. As of version 0.03, the cache directories, but not the cache entries, are globally writable by default. Users can override this by changing the 'directory_umask' option, or keep data private altogether by changing the 'cache_root'. What version did you test with? There may be a bug in there. - why don't you use Storable.pm? It should be much faster than Data::Dumper The TODO contains "Replace Data::Dumper with Storable (maybe)". :) The old File::Cache module used Storable, btw. It will be trivial to port the new Cache::FileCache to use Storable. I simply wanted to wait until I had the benchmarking code so I could be sure that Storable was faster. Actually, I'm not 100% sure that I expect Storable to be faster than Data::Dumper. If Data::Dumper turns out to be about equally fast, then I'll stay with it, because it is available on all Perl installations, I believe. Do you know if Storable is definitely faster? If you have benchmarks then I am more than happy to switch now. Or, do you know of a reason, feature wise, that I should switch? Again, it is trivial to do so. I have some preliminary benchmark code -- only good for relative benchmarking, but it is a start. I'd be happy to post the results here if people are interested. Could you send me the code? Then I'll look into benchmarking my module too. I checked it in as Cache::CacheBenchmark. It isn't good code, nor does it necessarily work just yet. I simply checked it in while I was in the middle of working on it. I'm turning it into a real benchmarking class for the cache, and hopefully that will help you a little bit. Cheers, -DeWitt
Re: [OT] Re: mod_perl shared memory with MM
I'm very intrigued by your thinking on locking. I had never considered the transaction based approach to caching you are referring to. I'll take this up privately with you, because we've strayed far off the mod_perl topic, although I find it fascinating. One more suggestion before you take this off the list: it's nice to have both. There are uses for explicit locking (I remember Randal saying he wished File::Cache had some locking support), but most people will be happy with atomic updates, and that's usually faster. Gunther's eXtropia stuff supports various locking options, and you can read some of the reasoning behind it in the docs at http://new.extropia.com/development/webware2/webware2.html. (See chapters 13 and 18.) - why don't you use 'real' constants for $SUCCESS and the like? (use constant) Two reasons, mostly historical, and not necessarily good ones. One, I benchmarked some code once that required high performance, and the use of constants was just slightly slower. Ick. Two, I like the syntax $hash{$CONSTANT}. If I remember correctly, $hash{CONSTANT} didn't work. This may have changed in newer versions of Perl. No, the use of constants as hash keys or in interpolated strings still doesn't work. I tried the constant module in my last project, and I found it to be more trouble than it was worth. It's annoying to have to write things like $hash{CONSTANT()} or "string @{[CONSTANT]}". Do you know if Storable is definitely faster? It is, and it's now part of the standard distribution. http://www.astray.com/pipermail/foo/2000-August/000169.html - Perrin
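For anyone who hasn't hit this, the hash-key and interpolation behaviour they're describing looks like this in plain Perl (easy to verify):

    use constant COLOR => 'red';

    my %hash;
    $hash{COLOR}   = 1;   # WRONG: key is the literal string "COLOR"
    $hash{COLOR()} = 2;   # ok: parens force the constant sub call, key is "red"
    $hash{+COLOR}  = 3;   # ok: unary plus also disambiguates

    print "value: @{[ COLOR ]}\n";   # interpolation needs the @{[ ]} trick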
Re: [OT] Re: mod_perl shared memory with MM
DeWitt Clinton wrote: [snip] Do you know if Storable is definitely faster? If you have benchmarks then I am more than happy to switch now. Or, do you know of a reason, feature wise, that I should switch? Again, it is trivial to do so. I've found it to be around 5 - 10 % faster on simple stuff in some benchmarking I did around a year ago. Can I ask why you are not using IPC::ShareLite (as it's pure C and apparently much faster than IPC::Shareable; I've never benchmarked it as I've always used IPC::ShareLite). Greg
Re: [OT] Re: mod_perl shared memory with MM
Can I ask why you are not using IPC::ShareLite (as it's pure C and apparently much faster than IPC::Shareable)? Full circle back to the original topic... IPC::MM is implemented in C and offers an actual hash interface backed by a BTree in shared memory. IPC::ShareLite only works for individual scalars. It wouldn't surprise me if a file system approach was faster than either of these on Linux, because of the aggressive caching. - Perrin
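From IPC::MM's documentation, usage is roughly the following (a sketch from the docs' synopsis; the segment size and file name are made up, and values are plain scalars):

    use IPC::MM ();

    # Create a shared-memory segment and a hash inside it.
    my $mm      = IPC::MM::mm_create(65536, '/tmp/mm_file');
    my $mm_hash = IPC::MM::mm_make_hash($mm);

    tie my %shared, 'IPC::MM::Hash', $mm_hash;

    $shared{counter} = 42;            # visible to every process sharing $mm
    print $shared{counter}, "\n";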
Re: [OT] Re: mod_perl shared memory with MM
Perrin Harkins wrote: Can I ask why you are not using IPC::ShareLite (as it's pure C and apparently much faster than IPC::Shareable)? Full circle back to the original topic... IPC::MM is implemented in C and offers an actual hash interface backed by a BTree in shared memory. IPC::ShareLite only works for individual scalars. Not tried that one! I've used the obvious ShareLite plus Storable to serialise hashes. It wouldn't surprise me if a file system approach was faster than either of these on Linux, because of the aggressive caching. It would be an interesting benchmark ... although it may only be a performance win on a lightly loaded machine, the assumption being that the stat'ing is fast on a lightly loaded system with fast, understressed disks. I could be completely wrong here tho ;-). Has anyone used the file system approach on a RAM disk? Greg
Re: mod_perl shared memory with MM
On Sat, 10 Mar 2001, Christian Jaeger wrote: For all of you trying to share session information efficiently, my IPC::FsSharevars module might be the right thing. I wrote it after having considered all the other solutions. It uses the file system directly (no BDB/etc. overhead) and provides sophisticated locking (even different variables from the same session can be written at the same time). Sounds very interesting. Does it use a multi-file approach like File::Cache? Have you actually benchmarked it against BerkeleyDB? It's hard to beat BDB because it uses a shared memory buffer, but theoretically the file system buffer could do it since that's managed by the kernel. - Perrin
Re: mod_perl shared memory with MM
At 0:23 Uhr -0800 10.3.2001, Perrin Harkins wrote: On Sat, 10 Mar 2001, Christian Jaeger wrote: For all of you trying to share session information efficiently, my IPC::FsSharevars module might be the right thing. I wrote it after having considered all the other solutions. It uses the file system directly (no BDB/etc. overhead) and provides sophisticated locking (even different variables from the same session can be written at the same time). Sounds very interesting. Does it use a multi-file approach like File::Cache? Have you actually benchmarked it against BerkeleyDB? It's hard to beat BDB because it uses a shared memory buffer, but theoretically the file system buffer could do it since that's managed by the kernel. Yes, it uses a separate file for each variable. This way locking is also solved: each variable has its own file lock. It's a bit difficult to write a real-world benchmark. I've tried to use DB_File before, but it was very slow when doing a sync after every write, as is recommended in various documentation to make it multiprocess-safe. What do you mean by BerkeleyDB, something different from DB_File? Currently I don't use Mmap (are there no cross-platform issues using that?), that might speed it up a bit more. Christian.
Re: mod_perl shared memory with MM
Christian Jaeger wrote: Yes, it uses a separate file for each variable. This way locking is also solved: each variable has its own file lock. You should take a look at DeWitt Clinton's Cache::FileCache module, announced on this list. It might make sense to merge your work into that module, which is the next generation of the popular File::Cache module. It's a bit difficult to write a real-world benchmark. It certainly is. Benchmarking all of the options is something that I've always wanted to do and never find enough time for. I've tried to use DB_File before, but it was very slow when doing a sync after every write, as is recommended in various documentation to make it multiprocess-safe. What do you mean by BerkeleyDB, something different from DB_File? BerkeleyDB.pm is an interface to later versions of the Berkeley DB library. It has a shared memory cache, and does not require syncing or opening and closing of files on every access. It has built-in locking, which can be configured to work at a page level, allowing multiple simultaneous writers. Currently I don't use Mmap (are there no cross-platform issues using that?), that might speed it up a bit more. That would be a nice option. Take a look at Cache::Mmap before you start. - Perrin
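A minimal BerkeleyDB.pm sketch of the setup Perrin describes (the home directory, file name, and the pre-serialized value are assumptions):

    use BerkeleyDB;

    # One shared environment per server: the memory pool (cache) and
    # locking subsystems live here and are shared by all processes.
    my $env = BerkeleyDB::Env->new(
        -Home  => '/tmp/bdb_env',     # directory must already exist
        -Flags => DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK,
    ) or die "env: $BerkeleyDB::Error";

    my $db = BerkeleyDB::Hash->new(
        -Filename => 'sessions.db',
        -Env      => $env,
        -Flags    => DB_CREATE,
    ) or die "db: $BerkeleyDB::Error";

    # No sync-per-write or reopen needed; the library coordinates access.
    $db->db_put($session_id, $frozen_session);   # values are plain strings
    $db->db_get($session_id, my $value);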
Re: mod_perl shared memory with MM
On Sat, Mar 10, 2001 at 04:35:02PM -0800, Perrin Harkins wrote: Christian Jaeger wrote: Yes, it uses a separate file for each variable. This way locking is also solved: each variable has its own file lock. You should take a look at DeWitt Clinton's Cache::FileCache module, announced on this list. It might make sense to merge your work into that module, which is the next generation of the popular File::Cache module. Yes! I'm actively looking for additional developers for the Perl Cache project. I'd love new implementations of the Cache interface. Cache::BerkeleyDBCache would be wonderful. Check out: http://sourceforge.net/projects/perl-cache/ For what it is worth, I don't explicitly lock. I do atomic writes instead, and have yet to hear anyone report a problem in the year the code has been public. It's a bit difficult to write a real-world benchmark. It certainly is. Benchmarking all of the options is something that I've always wanted to do and never find enough time for. I have some preliminary benchmark code -- only good for relative benchmarking, but it is a start. I'd be happy to post the results here if people are interested. -DeWitt
Re: mod_perl shared memory with MM
I have some preliminary benchmark code -- only good for relative benchmarking, but it is a start. I'd be happy to post the results here if people are interested. Please do. - Perrin
Re: mod_perl shared memory with MM
For all of you trying to share session information efficiently, my IPC::FsSharevars module might be the right thing. I wrote it after having considered all the other solutions. It uses the file system directly (no BDB/etc. overhead) and provides sophisticated locking (even different variables from the same session can be written at the same time). I wrote it for my fastcgi-based web app framework (Eile) but it should be usable for mod_perl things as well (I'm awaiting patches and suggestions in case it is not). It has not seen very much real-world testing yet. You may find the manpage on http://testwww.ethz.ch/perldoc/IPC/FsSharevars.pm and the module (no Makefile.PL yet) under http://testwww.ethz.ch/eile/download/ . Cheers Christian.
Re: mod_perl shared memory with MM
Adi Fairbank wrote: Yeah, I was thinking about something like that at first, but I've never played with named pipes, and it didn't sound too safe after reading the perlipc man page. What do you use, Perl open() calls, IPC::Open2/3, IPC::ChildSafe, or [...] IPC::ChildSafe is a good module, I use it here to access ClearCase, but it probably won't help you to exchange any data between Apache children.
Re: mod_perl shared memory with MM
Sean Chittenden wrote: Is there a way you can do that without using Storable? Right after I sent the message, I was thinking to myself that same question... If I extended IPC::MM, how could I get it to be any faster than Storable already is? You can also read in the data you want in a startup.pl file and put the info in a hash in a global memory space (MyApp::datastruct{}) that gets shared through forking (copy on write, not read, right?). If the data is read only, and only a certain size, this option has worked _very_ well for me in the past. -sc Yeah, I do use that method for all my read-only data, but by definition the persistent session cache is *not* read-only... it gets changed on pretty much every request. -Adi
Re: mod_perl shared memory with MM
Adi Fairbank wrote: I am trying to squeeze more performance out of my persistent session cache. In my application, the Storable image size of my sessions can grow upwards of 100-200K. It can take on the order of 200ms for Storable to deserialize and serialize this on my (lousy) hardware. It's a different approach, but I use simple MLDBM + SDBM_File when possible, as it's really fast for small records, but it has that 1024 byte limit per record! I am releasing a wrapper to CPAN (on its way now) called MLDBM::Sync that handles concurrent locking and i/o flushing for you. One advantage of this approach is that your session state will persist through a server reboot if it's written to disk. I also wrote a wrapper for SDBM_File called MLDBM::Sync::SDBM_File that overcomes the 1024 byte limit per record. The below numbers were for a benchmark on my dual PIII 450, linux 2.2.14, SCSI raid-1 ext2 fs mounted async. The benchmark can be found in the MLDBM::Sync package in the bench directory once it makes it to CPAN. With MLDBM (perldoc MLDBM) you can use Storable or the XS Data::Dumper method for serialization, as well as various DBMs. --Josh

=== INSERT OF 50 BYTE RECORDS ===
Time for 100 write/read's for SDBM_File               0.12 seconds     12288 bytes
Time for 100 write/read's for MLDBM::Sync::SDBM_File  0.14 seconds     12288 bytes
Time for 100 write/read's for GDBM_File               2.07 seconds     18066 bytes
Time for 100 write/read's for DB_File                 2.48 seconds     20480 bytes

=== INSERT OF 500 BYTE RECORDS ===
Time for 100 write/read's for SDBM_File               0.21 seconds    658432 bytes
Time for 100 write/read's for MLDBM::Sync::SDBM_File  0.51 seconds    135168 bytes
Time for 100 write/read's for GDBM_File               2.29 seconds     63472 bytes
Time for 100 write/read's for DB_File                 2.44 seconds    114688 bytes

=== INSERT OF 5000 BYTE RECORDS ===
(skipping test for SDBM_File 1024 byte limit)
Time for 100 write/read's for MLDBM::Sync::SDBM_File  1.30 seconds   2101248 bytes
Time for 100 write/read's for GDBM_File               2.55 seconds    832400 bytes
Time for 100 write/read's for DB_File                 3.27 seconds    839680 bytes

=== INSERT OF 20000 BYTE RECORDS ===
(skipping test for SDBM_File 1024 byte limit)
Time for 100 write/read's for MLDBM::Sync::SDBM_File  4.54 seconds  13162496 bytes
Time for 100 write/read's for GDBM_File               5.39 seconds   2063912 bytes
Time for 100 write/read's for DB_File                 4.79 seconds   2068480 bytes
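Based on Joshua's description, usage should look like the usual MLDBM tie (a sketch; the file name is made up, and the API is worth re-checking once the module reaches CPAN):

    use MLDBM::Sync;                                   # wraps locking and flushing
    use MLDBM qw(MLDBM::Sync::SDBM_File Storable);     # DBM layer + serializer
    use Fcntl qw(O_CREAT O_RDWR);

    tie my %session_cache, 'MLDBM::Sync', '/tmp/sessions.dbm',
        O_CREAT | O_RDWR, 0640 or die "tie: $!";

    $session_cache{$session_id} = \%session_data;      # serialized via Storable
    my $data = $session_cache{$session_id};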
Re: mod_perl shared memory with MM
Is there a way you can do that without using Storable? Right after I sent the message, I was thinking to myself that same question... If I extended IPC::MM, how could I get it to be any faster than Storable already is? You can also read in the data you want in a startup.pl file and put the info in a hash in a global memory space (MyApp::datastruct{}) that gets shared through forking (copy on write, not read, right?). If the data is read only, and only a certain size, this option has worked _very_ well for me in the past. -sc -- Sean Chittenden [EMAIL PROTECTED] C665 A17F 9A56 286C 5CFB 1DEA 9F4F 5CEF 1EDD FAAD
Re: mod_perl shared memory with MM
The night of Fat Tuesday no less... that didn't help any either. ::sigh:: Here's one possibility that I've done in the past because I needed mod_perl sessions to be able to talk with non-mod_perl programs. I setup a named bi-directional pipe that let you write a query to it for session information, and it wrote back with whatever you were looking for. Given that this needed to support perl, java, and c, it worked _very_ well and was extremely fast. Something you may also want to consider because it keeps your session information outside of apache (in case of restart of apache, or desire to synchronize session information across multiple hosts). -sc On Wed, Feb 28, 2001 at 09:25:45PM -0500, Adi Fairbank wrote: It's ok, I do that a lot, too. Usually right after I click "Send" is when I realize I forgot something or didn't think it through all the way. :) Sean Chittenden wrote: Hmm... yeah, whoops. I suppose that's what I get for sending email that late. :~) -sc -- Sean Chittenden [EMAIL PROTECTED] C665 A17F 9A56 286C 5CFB 1DEA 9F4F 5CEF 1EDD FAAD
Re: mod_perl shared memory with MM
Sean, Yeah, I was thinking about something like that at first, but I've never played with named pipes, and it didn't sound too safe after reading the perlipc man page. What do you use, Perl open() calls, IPC::Open2/3, IPC::ChildSafe, or something else? How stable has it been for you? I just didn't like all those warnings in the IPC::Open2 and perlipc man pages. -Adi Sean Chittenden wrote: The night of Fat Tuesday no less... that didn't help any either. ::sigh:: Here's one possibility that I've done in the past because I needed mod_perl sessions to be able to talk with non-mod_perl programs. I setup a named bi-directional pipe that let you write a query to it for session information, and it wrote back with whatever you were looking for. Given that this needed to support perl, java, and c, it worked _very_ well and was extremely fast. Something you may also want to consider because it keeps your session information outside of apache (in case of restart of apache, or desire to synchronize session information across multiple hosts). -sc
mod_perl shared memory with MM
I am trying to squeeze more performance out of my persistent session cache. In my application, the Storable image size of my sessions can grow upwards of 100-200K. It can take on the order of 200ms for Storable to deserialize and serialize this on my (lousy) hardware. I'm looking at RSE's MM and the Perl module IPC::MM as a persistent session cache. Right now IPC::MM doesn't support multi-dimensional Perl data structures, nor blessed references, so I will have to extend it to support these. My question is: is anyone else using IPC::MM under mod_perl? .. would you if it supported multi-dimensional Perl data? My other question is: since this will be somewhat moot once Apache 2.0 + mod_perl 2.0 are stable, is it worth the effort? What's the ETA on mod_perl 2.0? Should I spend my effort helping with that instead? Any comments appreciated, -Adi
Re: mod_perl shared memory with MM
Adi Fairbank wrote: I am trying to squeeze more performance out of my persistent session cache. In my application, the Storable image size of my sessions can grow upwards of 100-200K. It can take on the order of 200ms for Storable to deserialize and serialize this on my (lousy) hardware. I'm looking at RSE's MM and the Perl module IPC::MM as a persistent session cache. Right now IPC::MM doesn't support multi-dimensional Perl data structures, nor blessed references, so I will have to extend it to support these. Is there a way you can do that without using Storable? If not, maybe you should look at partitioning your data more, so that only the parts you really need for a given request are loaded and saved. I'm pleased to see people using IPC::MM, since I bugged Arthur to put it on CPAN. However, if it doesn't work for you there are other options such as BerkeleyDB (not DB_File) which should provide a similar level of performance. - Perrin
Re: mod_perl shared memory with MM
Perrin Harkins wrote: Adi Fairbank wrote: I am trying to squeeze more performance out of my persistent session cache. In my application, the Storable image size of my sessions can grow upwards of 100-200K. It can take on the order of 200ms for Storable to deserialize and serialize this on my (lousy) hardware. I'm looking at RSE's MM and the Perl module IPC::MM as a persistent session cache. Right now IPC::MM doesn't support multi-dimensional Perl data structures, nor blessed references, so I will have to extend it to support these. Is there a way you can do that without using Storable? Right after I sent the message, I was thinking to myself that same question... If I extended IPC::MM, how could I get it to be any faster than Storable already is? Basically what I came up with off the top of my head was to try to map each Perl hash to a mm_hash and each Perl array to a mm_btree_table, all the way down through the multi-level data structure. Every time you add a hashref to your tied IPC::MM hash, it would create a new mm_hash and store the reference to that child in the parent. Ditto for arrayrefs, but use mm_btree_table. If this is possible, then you could operate on the guts of a deep data structure without completely serializing and deserializing it every time. If not, maybe you should look at partitioning your data more, so that only the parts you really need for a given request are loaded and saved. Good idea! That would save a lot of time, and would be easy to do with my design. Silly I didn't think of that. I'm pleased to see people using IPC::MM, since I bugged Arthur to put it on CPAN. However, if it doesn't work for you there are other options such as BerkeleyDB (not DB_File) which should provide a similar level of performance. Thanks.. I'll look at BerkeleyDB. -Adi
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl withsc ripts that contain un-shared memory
Sam Horrocks wrote: say they take two slices, and interpreters 1 and 2 get pre-empted and go back into the queue. So then requests 5/6 in the queue have to use other interpreters, and you expand the number of interpreters in use. But still, you'll wind up using the smallest number of interpreters required for the given load and timeslice. As soon as those 1st and 2nd perl interpreters finish their run, they go back at the beginning of the queue, and the 7th/8th or later requests can then use them, etc. Now you have a pool of maybe four interpreters, all being used on an MRU basis. But it won't expand beyond that set unless your load goes up or your program's CPU time requirements increase beyond another timeslice. MRU will ensure that whatever the number of interpreters in use, it is the lowest possible, given the load, the CPU-time required by the program and the size of the timeslice. You know, I had a brief look through some of the SpeedyCGI code yesterday, and I think the MRU process selection might be a bit of a red herring. I think the real reason Speedy won the memory test is the way it spawns processes. If I understand what's going on in Apache's source, once every second it has a look at the scoreboard and says "less than MinSpareServers are idle, so I'll start more" or "more than MaxSpareServers are idle, so I'll kill one". It only kills one per second. It starts by spawning one, but the number spawned goes up exponentially each time it sees there are still not enough idle servers, until it hits 32 per second. It's easy to see how this could result in spawning too many in response to sudden load, and then taking a long time to clear out the unnecessary ones. In contrast, Speedy checks on every request to see if there are enough backends running. If there aren't, it spawns more until there are as many backends as queued requests. That means it never overshoots the mark. Going back to your example up above, if Apache actually controlled the number of processes tightly enough to prevent building up idle servers, it wouldn't really matter much how processes were selected. If after the 1st and 2nd interpreters finish their run they went to the end of the queue instead of the beginning of it, that simply means they will sit idle until called for instead of some other two processes sitting idle until called for. If the systems were both efficient enough about spawning to only create as many interpreters as needed, none of them would be sitting idle and memory usage would always be as low as possible. I don't know if I'm explaining this very well, but the gist of my theory is that at any given time both systems will require an equal number of in-use interpreters to do an equal amount of work, and the differentiator between the two is Apache's relatively poor estimate of how many processes should be available at any given time. I think this theory matches up nicely with the results of Sam's tests: when MaxClients prevents Apache from spawning too many processes, both systems have similar performance characteristics. There are some knobs to twiddle in Apache's source if anyone is interested in playing with it. You can change the frequency of the checks and the maximum number of servers spawned per check. I don't have much motivation to do this investigation myself, since I've already tuned our MaxClients and process size constraints to prevent problems with our application. - Perrin
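For reference, the httpd.conf directives involved in that tuning (values here are illustrative, not recommendations):

    # httpd.conf -- Apache 1.3 process-management knobs
    MinSpareServers      5     # spawn more if fewer than this are idle
    MaxSpareServers     10     # kill one per second while more are idle
    StartServers         5     # initial pool at startup
    MaxClients          30     # hard cap; size this so you never swap
    MaxRequestsPerChild 500    # recycle children to bound unshared growth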
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl withsc ripts that contain un-shared memory
There's only one run queue in the kernel. The first task ready to run is put at the head of that queue, and anything arriving afterwards waits. Only if that first task blocks on a resource or takes a very long time, or a higher priority process becomes able to run due to an interrupt, is that process taken out of the queue. Note that any I/O request that isn't completely handled by buffers will trigger the 'blocks on a resource' clause above, which means that jobs doing any real work will complete in an order determined by something other than the cpu, and not strictly serialized. Also, most of my web servers are dual-cpu, so even cpu-bound processes may complete out of order. I think it's much easier to visualize how MRU helps when you look at one thing running at a time. And MRU works best when every process runs to completion instead of blocking, etc. But even if the process gets timesliced, blocked, etc., MRU still degrades gracefully. You'll get more processes in use, but still the numbers will remain small. Similarly, because of the non-deterministic nature of computer systems, Apache doesn't service requests on an LRU basis; you're comparing SpeedyCGI against a straw man. Apache's servicing algorithm approaches randomness, so you need to build a comparison between forced-MRU and random choice. Apache httpd's are scheduled on an LRU basis. This was discussed early in this thread. Apache uses a file-lock for its mutex around the accept call, and file-locking is implemented in the kernel using a round-robin (fair) selection in order to prevent starvation. This results in incoming requests being assigned to httpd's in an LRU fashion. But, if you are running a front/back end apache with a small number of spare servers configured on the back end, there really won't be any idle perl processes during the busy times you care about. That is, the backends will all be running or apache will shut them down, and there won't be any difference between MRU and LRU (the difference would be which idle process waits longer - if none are idle there is no difference). If you can tune it just right so you never run out of ram, then I think you could get the same performance as MRU on something like hello-world. Once the httpd's get into the kernel's run queue, they finish in the same order they were put there, unless they block on a resource, get timesliced or are pre-empted by a higher priority process. Which means they don't finish in the same order if (a) you have more than one cpu, (b) they do any I/O (including delivering the output back, which they all do), or (c) some of them run long enough to consume a timeslice. Try it and see. I'm sure you'll run more processes with speedycgi, but you'll probably run a whole lot fewer perl interpreters and need less ram. Do you have a benchmark that does some real work (at least a dbm lookup) to compare against a front/back end mod_perl setup? No, but if you send me one, I'll run it.
RE: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
This doesn't affect the argument, because the core of it is that:

a) the CPU will not completely process a single task all at once; instead, it will divide its time _between_ the tasks
b) tasks do not arrive at regular intervals
c) tasks take varying amounts of time to complete

[snip]

I won't agree with (a) unless you qualify it further - what do you claim is the method or policy for (a)?

I think this has been answered ... basically, resource conflicts (including I/O), interrupts, long running tasks, higher priority tasks, and, of course, the process yielding, can all cause the CPU to switch processes (which of these qualify depends very much on the OS in question). This is why, despite the efficiency of single-task running, you can usefully run more than one process on a UNIX system. Otherwise, if you ran a single Apache process and had no traffic, you couldn't run a shell at the same time - Apache would consume practically all your CPU in its select() loop 8-)

Apache httpd's are scheduled on an LRU basis. This was discussed early in this thread. Apache uses a file-lock for its mutex around the accept call, and file-locking is implemented in the kernel using a round-robin (fair) selection in order to prevent starvation. This results in incoming requests being assigned to httpd's in an LRU fashion.

I'll apologise, and say, yes, of course you're right, but I do have a query: there are (IIRC) 5 methods that Apache uses to serialize requests: fcntl(), flock(), Sys V semaphores, uslock (IRIX only) and Pthreads (reliably only on Solaris). Do they _all_ result in LRU?

Remember that the httpd's in the speedycgi case will have very little un-shared memory, because they don't have perl interpreters in them. So the processes are fairly indistinguishable, and the LRU isn't as big a penalty in that case.

Yes, _but_, interpreter for interpreter, won't the equivalent speedycgi have roughly as much unshared memory as the mod_perl? I've had a lot of (dumb) discussions with people who complain about the size of Apache+mod_perl without realising that the interpreter code's all shared, and with pre-loading a lot of the perl code can be too. While I _can_ see speedycgi having an advantage (because it's got a much better overview of what's happening, and can intelligently manage the situation), I don't think it's as large as you're suggesting. I think this needs to be intensively benchmarked to answer that.

...other interpreters, and you expand the number of interpreters in use. But still, you'll wind up using the smallest number of interpreters required for the given load and timeslice. As soon as those 1st and 2nd perl interpreters finish their run, they go back at the beginning of the queue, and the 7th/8th or later requests can then use them, etc. Now you have a pool of maybe four interpreters, all being used on an MRU basis. But it won't expand beyond that set unless your load goes up or your program's CPU time requirements increase beyond another timeslice. MRU will ensure that whatever the number of interpreters in use, it is the lowest possible, given the load, the CPU-time required by the program and the size of the timeslice.

Yep... no arguments here. SpeedyCGI should result in fewer interpreters. I will say that there are a lot of convincing reasons to follow the SpeedyCGI model rather than the mod_perl model, but I've generally thought that the increase in that kind of performance that can be obtained is sufficiently minimal as to not warrant the extra layer... thoughts, anyone?

Stephen.
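For readers who haven't looked at this code, the serialization in question is roughly the following pattern. This is a hedged Perl sketch of the flock()-style accept mutex, not Apache's source, and whether the kernel hands the lock out in strict FIFO order (Sam's LRU claim, and exactly Stephen's question about the other four methods) depends on the lock type and the OS:

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use IO::Socket::INET;

# Minimal pre-forking server that serializes accept() with an exclusive
# lock on a shared lockfile, the way Apache 1.3's flock()/fcntl() mutex
# methods do. If the kernel queues lock waiters fairly, requests get
# handed to the longest-idle child, i.e. LRU selection.
my $listen = IO::Socket::INET->new(
    LocalPort => 8080, Listen => 64, ReuseAddr => 1
) or die "listen: $!";
open my $lock, '>', '/tmp/accept.lock' or die "lockfile: $!";

for my $n (1 .. 4) {                      # pre-fork 4 children
    defined(my $pid = fork) or die "fork: $!";
    next if $pid;                         # parent keeps forking
    while (1) {                           # child: serialize on the mutex
        flock($lock, LOCK_EX) or die "flock: $!";
        my $client = $listen->accept;     # only the lock holder accepts
        flock($lock, LOCK_UN);
        next unless $client;
        print $client "HTTP/1.0 200 OK\r\n\r\nserved by pid $$\n";
        close $client;
    }
    exit;
}
wait for 1 .. 4;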
RE: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
There seems to be a lot of talk here, and analogies, and zero real-world benchmarking. Now it seems to me from reading this thread, that speedycgi would be better where you run 1 script, or only a few scripts, and mod_perl might win where you have a large application with hundreds of different URLs with different code being executed on each. That may change with the next release of speedy, but then lots of things will change with the next major release of mod_perl too, so it's irrelevant until both are released.

And as well as that, speedy still suffers (IMHO) in that it still follows the CGI scripting model, whereas mod_perl offers a much more flexible environment and feature-rich API (the Apache API). What's more, I could never build something like AxKit in speedycgi, without resorting to hacks like mod_rewrite to hide nasty URL's. At least that's my conclusion from first appearances.

Either way, both solutions have their merits. Neither is going to totally replace the other. What I'd really like to do though is sum up this thread in a short article for take23. I'll see if I have time on Sunday to do it.

-- Matt
Director and CTO, AxKit.com Ltd -- XML Application Serving
http://axkit.org -- XSLT, XPathScript, XSP
Personal Web Site: http://sergeant.org/
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
You know, I had a brief look through some of the SpeedyCGI code yesterday, and I think the MRU process selection might be a bit of a red herring. I think the real reason Speedy won the memory test is the way it spawns processes.

Please take a look at that code again. There's no smoke and mirrors, no red-herrings. Also, I don't look at the benchmarks as "winning" - I am not trying to start a mod_perl vs speedy battle here. Gunther wanted to know if there were "real benchmarks", so I reluctantly put them up.

Here's how SpeedyCGI works (this is from version 2.02 of the code):

When the frontend starts, it tries to quickly grab a backend from the front of the be_wait queue, which is a LIFO. This is in speedy_frontend.c, get_a_backend() function. If there aren't any idle be's, it puts itself onto the fe_wait queue. Same file, get_a_backend_hard(). If this fe (frontend) is at the front of the fe_wait queue, it "takes charge" and starts looking to see if a backend needs to be spawned. This is part of the "frontend_ping()" function. It will only spawn a be if no other backends are being spawned, so only one backend gets spawned at a time. Every frontend in the queue drops into a sigsuspend and waits for an alarm signal. The alarm is set for 1-second. This is also in get_a_backend_hard(). When a backend is ready to handle code, it goes and looks at the fe_wait queue and if there are fe's there, it sends a SIGALRM to the one at the front, and sets the sent_sig flag for that fe. This is done in speedy_group.c, speedy_group_sendsigs(). When a frontend wakes on an alarm (either due to a timeout, or due to a be waking it up), it looks at its sent_sig flag to see if it can now grab a be from the queue. If so it does that. If not, it runs various checks then goes back to sleep.

In most cases, you should get a be from the lifo right at the beginning in the get_a_backend() function. Unless there aren't enough be's running, or something is killing them (bad perl code), or you've set the MaxBackends option to limit the number of be's.

If I understand what's going on in Apache's source, once every second it has a look at the scoreboard and says "less than MinSpareServers are idle, so I'll start more" or "more than MaxSpareServers are idle, so I'll kill one". It only kills one per second. It starts by spawning one, but the number spawned goes up exponentially each time it sees there are still not enough idle servers, until it hits 32 per second. It's easy to see how this could result in spawning too many in response to sudden load, and then taking a long time to clear out the unnecessary ones. In contrast, Speedy checks on every request to see if there are enough backends running. If there aren't, it spawns more until there are as many backends as queued requests.

Speedy does not check on every request to see if there are enough backends running. In most cases, the only thing the frontend does is grab an idle backend from the lifo. Only if there are none available does it start to worry about how many are running, etc.

That means it never overshoots the mark.

You're correct that speedy does try not to overshoot, but mainly because there's no point in overshooting - it just wastes swap space. But that's not the heart of the mechanism. There truly is a LIFO involved. Please read that code again, or run some tests.
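The LIFO behaviour Sam describes can be boiled down to a few lines. The following is a toy model built from his prose above, not the actual speedy_frontend.c logic, but it shows why a stream of serial requests keeps landing on the same backend:

#!/usr/bin/perl
use strict;
use warnings;

# Toy model of the be_wait/fe_wait queues: idle backends sit on a LIFO,
# so the most recently used backend is always handed out first.
my @be_wait;                 # idle backends, treated as a stack (LIFO)
my @fe_wait;                 # frontends waiting for a backend
my $next_be = 0;

sub get_a_backend {
    my ($fe) = @_;
    return pop @be_wait if @be_wait;    # fast path: reuse the MRU backend
    push @fe_wait, $fe;                 # queue up...
    return spawn_backend() if $fe_wait[0] == $fe;  # front fe "takes charge"
    return undef;                       # others would sleep until signalled
}

sub spawn_backend   { shift @fe_wait; return ++$next_be }
sub release_backend { push @be_wait, $_[0] }   # done: back on TOP of the LIFO

# Serial requests reuse the same backend over and over:
for my $req (1 .. 5) {
    my $be = get_a_backend($req);
    printf "request %d served by backend %d\n", $req, $be;
    release_backend($be);
}

All five requests print "backend 1": one backend is spawned, and because it is pushed back on top of the stack after each run, it is also the next one taken.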
Speedy could overshoot by far, and the worst that would happen is that you would get a lot of idle backends sitting in virtual memory, which the kernel would page out, and then at some point they'll time out and die. Unless of course the load increases to a point where they're needed, in which case they would get used. If you have speedy installed, you can manually start backends yourself and test. Just run "speedy_backend script.pl " to start a backend. If you start lots of those on a script that says 'print "$$\n"', then run the frontend on the same script, you will still see the same pid over and over. This is the LIFO in action, reusing the same process over and over. Going back to your example up above, if Apache actually controlled the number of processes tightly enough to prevent building up idle servers, it wouldn't really matter much how processes were selected. If after the 1st and 2nd interpreters finish their run they went to the end of the queue instead of the beginning of it, that simply means they will sit idle until called for instead of some other two processes sitting idle until called for. If the systems were both efficient enough about spawning to only create as many interpreters as needed, none of them would be sitting idle and memory usage would always be as low as possible. I don't know if I'm explaining this very well, but the gist of my theory is that at any given time both
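Spelled out, the manual test Sam describes looks like the following, assuming speedy is the name of the frontend binary installed alongside speedy_backend:

#!/usr/bin/speedy
# pid.pl -- the one-line script from Sam's test: print the pid of the
# backend that handled this run.
print "$$\n";

# Shell steps for the test, per the description above:
#   speedy_backend pid.pl &    # hand-start a few spare backends
#   speedy_backend pid.pl &
#   speedy pid.pl              # run the frontend a few times...
#   speedy pid.pl              # ...and the SAME pid comes back every
#                              # time, because the LIFO hands out the
#                              # most recently used backend first.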
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
On Fri, 19 Jan 2001, Sam Horrocks wrote:

You know, I had a brief look through some of the SpeedyCGI code yesterday, and I think the MRU process selection might be a bit of a red herring. I think the real reason Speedy won the memory test is the way it spawns processes.

Please take a look at that code again. There's no smoke and mirrors, no red-herrings.

I didn't mean that MRU isn't really happening, just that it isn't the reason why Speedy is running fewer interpreters.

Also, I don't look at the benchmarks as "winning" - I am not trying to start a mod_perl vs speedy battle here.

Okay, but let's not be so polite that we fail to acknowledge when someone is onto a better way of doing things. Stealing good ideas from other projects is a time-honored open source tradition.

Speedy does not check on every request to see if there are enough backends running. In most cases, the only thing the frontend does is grab an idle backend from the lifo. Only if there are none available does it start to worry about how many are running, etc.

Sorry, I had a lot of the details wrong about what Speedy is doing. However, it still sounds like it has a more efficient approach than Apache in terms of managing process spawning.

You're correct that speedy does try not to overshoot, but mainly because there's no point in overshooting - it just wastes swap space. But that's not the heart of the mechanism. There truly is a LIFO involved. Please read that code again, or run some tests. Speedy could overshoot by far, and the worst that would happen is that you would get a lot of idle backends sitting in virtual memory, which the kernel would page out, and then at some point they'll time out and die.

When you spawn a new process it starts out in real memory, doesn't it? Spawning too many could use up all the physical RAM and send a box into swap, at least until it managed to page out the idle processes. That's what I think happened to mod_perl in this test.

If you start lots of those on a script that says 'print "$$\n"', then run the frontend on the same script, you will still see the same pid over and over. This is the LIFO in action, reusing the same process over and over.

Right, but I don't think that explains why fewer processes are running. Suppose you start 10 processes, and then send in one request at a time, and that request takes one time slice to complete. If MRU works perfectly, you'll get process 1 over and over again handling the requests. LRU will use process 1, then 2, then 3, etc. But both of them have 9 processes idle and one in use at any given time. The 9 idle ones should either be killed off, or ideally never have been spawned in the first place. I think Speedy does a better job of preventing unnecessary process spawning. One alternative theory is that keeping the same process busy instead of rotating through all 10 means that the OS can page out the other 9 and thus use less physical RAM.

Anyway, I feel like we've been putting you on the spot, and I don't want you to feel obligated to respond personally to all the messages on this thread. I'm only still talking about it because it's interesting and I've learned a couple of things about Linux and Apache from it. If I get the chance this weekend, I'll try some tests of my own.

- Perrin
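Perrin's 10-process thought experiment is easy to check mechanically. In the sketch below the two idle lists differ only in which end requests are taken from; the distinct-worker counts differ, while the idle-process count (nine) is identical at every instant:

#!/usr/bin/perl
use strict;
use warnings;

# 10 workers, requests arriving one at a time, each taking one time
# slice. Count how many DISTINCT workers ever get used under MRU
# (stack) vs LRU (queue) selection.
my @mru = (1 .. 10);         # idle list used as a stack
my @lru = (1 .. 10);         # idle list used as a queue
my (%mru_used, %lru_used);

for my $req (1 .. 100) {
    my $m = pop @mru;   $mru_used{$m}++;   push @mru, $m;  # take/return at top
    my $l = shift @lru; $lru_used{$l}++;   push @lru, $l;  # take head, return tail
}
printf "MRU touched %d workers, LRU touched %d\n",
       scalar keys %mru_used, scalar keys %lru_used;

# MRU touches 1 worker; LRU cycles through all 10. Yet in BOTH cases 9
# workers sit idle at any moment, which is Perrin's point: any memory
# win has to come from not spawning/keeping the idle 9, or from the OS
# being able to page them out while they stay untouched.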
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
- Original Message - From: "Sam Horrocks" [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: "mod_perl list" [EMAIL PROTECTED]; "Stephen Anderson" [EMAIL PROTECTED] Sent: Thursday, January 18, 2001 10:38 PM Subject: Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory

There's only one run queue in the kernel. The first task ready to run is put at the head of that queue, and anything arriving afterwards waits. Only if that first task blocks on a resource or takes a very long time, or a higher priority process becomes able to run due to an interrupt is that process taken out of the queue.

Note that any I/O request that isn't completely handled by buffers will trigger the 'blocks on a resource' clause above, which means that jobs doing any real work will complete in an order determined by something other than the cpu and not strictly serialized. Also, most of my web servers are dual-cpu so even cpu bound processes may complete out of order.

Similarly, because of the non-deterministic nature of computer systems, Apache doesn't service requests on an LRU basis; you're comparing SpeedyCGI against a straw man. Apache's servicing algorithm approaches randomness, so you need to build a comparison between forced-MRU and random choice.

Apache httpd's are scheduled on an LRU basis. This was discussed early in this thread. Apache uses a file-lock for its mutex around the accept call, and file-locking is implemented in the kernel using a round-robin (fair) selection in order to prevent starvation. This results in incoming requests being assigned to httpd's in an LRU fashion.

But, if you are running a front/back end apache with a small number of spare servers configured on the back end there really won't be any idle perl processes during the busy times you care about. That is, the backends will all be running or apache will shut them down and there won't be any difference between MRU and LRU (the difference would be which idle process waits longer - if none are idle there is no difference).

Once the httpd's get into the kernel's run queue, they finish in the same order they were put there, unless they block on a resource, get timesliced or are pre-empted by a higher priority process.

Which means they don't finish in the same order if (a) you have more than one cpu, (b) they do any I/O (including delivering the output back which they all do), or (c) some of them run long enough to consume a timeslice.

Try it and see. I'm sure you'll run more processes with speedycgi, but you'll probably run a whole lot fewer perl interpreters and need less ram.

Do you have a benchmark that does some real work (at least a dbm lookup) to compare against a front/back end mod_perl setup?

Remember that the httpd's in the speedycgi case will have very little un-shared memory, because they don't have perl interpreters in them. So the processes are fairly indistinguishable, and the LRU isn't as big a penalty in that case. This is why the original designers of Apache thought it was safe to create so many httpd's. If they all have the same (shared) memory, then creating a lot of them does not have much of a penalty. mod_perl applications throw a big monkey wrench into this design when they add a lot of unshared memory to the httpd's. This is part of the reason the front/back end mod_perl configuration works well, keeping the backend numbers low.
The real win when serving over the internet, though, is that the perl memory is no longer tied up while delivering the output back over frequently slow connections. Les Mikesell [EMAIL PROTECTED]
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
I would agree with you if we were talking about requests that were coming in with more time between them. Speedycgi will definitely use fewer interpreters in that case. This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

I found that setting MaxClients to 100 stopped the paging. At concurrency level 100, both mod_perl and mod_speedycgi showed similar rates with ab. Even at higher levels (300), they were comparable.

That's what I would expect if both systems have a similar limit of how many interpreters they can fit in RAM at once. Shared memory would help here, since it would allow more interpreters to run. By the way, do you limit the number of SpeedyCGI processes as well? It seems like you'd have to, or they'd start swapping too when you throw too many requests in.

SpeedyCGI has an optional limit on the number of processes, but I didn't use it in my testing. But, to show that the underlying problem is still there, I then changed the hello_world script and doubled the amount of un-shared memory. And of course the problem then came back for mod_perl, although speedycgi continued to work fine. I think this shows that mod_perl is still using quite a bit more memory than speedycgi to provide the same service.

I'm guessing that what happened was you ran mod_perl into swap again. You need to adjust MaxClients when your process size changes significantly.

Right, but this also points out how difficult it is to get mod_perl tuning just right. My opinion is that the MRU design adapts more dynamically to the load. I believe that with speedycgi you don't have to lower the MaxClients setting, because it's able to handle a larger number of clients, at least in this test.

Maybe what you're seeing is an ability to handle a larger number of requests (as opposed to clients) because of the performance benefit I mentioned above.

I don't follow. When not all processes are in use, I think Speedy would handle requests more quickly, which would allow it to handle n requests in less time than mod_perl. Saying it handles more clients implies that the requests are simultaneous. I don't think it can handle more simultaneous requests.

Don't agree.

Are the speedycgi+Apache processes smaller than the mod_perl processes? If not, the maximum number of concurrent requests you can handle on a given box is going to be the same.

The size of the httpds running mod_speedycgi, plus the size of speedycgi perl processes is significantly smaller than the total size of the httpd's running mod_perl. The reason for this is that only a handful of perl processes are required by speedycgi to handle the same load, whereas mod_perl uses a perl interpreter in all of the httpds.

I think this is true at lower levels, but not when the number of simultaneous requests gets up to the maximum that the box can handle. At that point, it's a question of how many interpreters can fit in memory. I would expect the size of one Speedy + one httpd to be about the same as one mod_perl/httpd when no memory is shared. With sharing, you'd be able to run more processes.

I'd agree that the size of one Speedy backend + one httpd would be the same or even greater than the size of one mod_perl/httpd when no memory is shared. But because the speedycgi httpds are small (no perl in them) and the number of SpeedyCGI perl interpreters is small, the total memory required is significantly smaller for the same load.
Sam __ Gunther Birznieks ([EMAIL PROTECTED]) eXtropia - The Web Technology Company http://www.extropia.com/
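The tuning rule being wrestled with here is the usual back-of-the-envelope one: MaxClients must keep the sum of the children's un-shared memory inside physical RAM. The figures below are made-up placeholders, but they show why doubling a script's un-shared memory silently halves the safe MaxClients:

#!/usr/bin/perl
use strict;
use warnings;

# Back-of-the-envelope MaxClients sizing: keep the unshared memory of
# all children inside the RAM you can spare, so the box never pages.
my $ram_for_apache_mb  = 512;   # RAM budget for httpd children (assumed)
my $unshared_per_child = 8;     # MB of UNSHARED memory per child (assumed)
my $max_clients = int($ram_for_apache_mb / $unshared_per_child);
print "MaxClients $max_clients\n";   # prints: MaxClients 64

# Double the script's unshared memory, as in the test above, and the
# safe MaxClients halves -- which is why the swap problem "came back"
# until the setting was re-tuned.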
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Each process runs until it either blocks on a resource (timer, network, disk, pipe to another process, etc), or a higher priority process pre-empts it, or it's taken so much time that the kernel wants to give another process a chance to run.

- A set of requests can be considered "simultaneous" if they all arrive and start being handled in a period of time shorter than the time it takes to service a request.

That sounds OK.

Operating on these two assumptions, I say that 10 simultaneous requests will require 10 interpreters to service them. There's no way to handle them with fewer, unless you queue up some of the requests and make them wait.

Right. And that waiting takes place: - In the mutex around the accept call in the httpd - In the kernel's run queue when the process is ready to run, but is waiting for other processes ahead of it.

So, since there is only one CPU, then in both cases (mod_perl and SpeedyCGI), processes spend time waiting. But what happens in the case of SpeedyCGI is that while some of the httpd's are waiting, one of the earlier speedycgi perl interpreters has already finished its run through the perl code and has put itself back at the front of the speedycgi queue. And by the time that Nth httpd gets around to running, it can re-use that first perl interpreter instead of needing yet another process. This is why it's important that you don't assume that Unix is truly concurrent.

I also say that if you have a top limit of 10 interpreters on your machine because of memory constraints, and you're sending in 10 simultaneous requests constantly, all interpreters will be used all the time. In that case it makes no difference to the throughput whether you use MRU or LRU.

This is not true for SpeedyCGI, because of the reason I give above. 10 simultaneous requests will not necessarily require 10 interpreters. What you say would be true if you had 10 processors and could get true concurrency. But on single-cpu systems you usually don't need 10 unix processes to handle 10 requests concurrently, since they get serialized by the kernel anyways.

I think the CPU slices are smaller than that. I don't know much about process scheduling, so I could be wrong.

I would agree with you if we were talking about requests that were coming in with more time between them. Speedycgi will definitely use fewer interpreters in that case. This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

I found that setting MaxClients to 100 stopped the paging. At concurrency level 100, both mod_perl and mod_speedycgi showed similar rates with ab. Even at higher levels (300), they were comparable.

That's what I would expect if both systems have a similar limit of how many interpreters they can fit in RAM at once. Shared memory would help here, since it would allow more interpreters to run. By the way, do you limit the number of SpeedyCGI processes as well? It seems like you'd have to, or they'd start swapping too when you throw too many requests in.

SpeedyCGI has an optional limit on the number of processes, but I didn't use it in my testing. But, to show that the underlying problem is still there, I then changed the hello_world script and doubled the amount of un-shared memory. And of course the problem then came back for mod_perl, although speedycgi continued to work fine. I think this shows that mod_perl is still using quite a bit more memory than speedycgi to provide the same service.

I'm guessing that what happened was you ran mod_perl into swap again.
You need to adjust MaxClients when your process size changes significantly.

Right, but this also points out how difficult it is to get mod_perl tuning just right. My opinion is that the MRU design adapts more dynamically to the load. I believe that with speedycgi you don't have to lower the MaxClients setting, because it's able to handle a larger number of clients, at least in this test.

Maybe what you're seeing is an ability to handle a larger number of requests (as opposed to clients) because of the performance benefit I mentioned above.

I don't follow. When not all processes are in use, I think Speedy would handle requests more quickly, which would allow it to handle n requests in less time than mod_perl. Saying it handles more clients implies that the requests are simultaneous.
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
I have a wide assortment of queries on a site, some of which take several minutes to execute, while others execute in less than one second. If I understand this analogy correctly, I'd be better off with the current incarnation of mod_perl because there would be more cashiers around to serve the "quick cups of coffee" that many customers request at my diner. Is this correct?

Sam Horrocks wrote:

I think the major problem is that you're assuming that just because there are 10 constant concurrent requests, that there have to be 10 perl processes serving those requests at all times in order to get maximum throughput. The problem with that assumption is that there is only one CPU - ten processes cannot all run simultaneously anyways, so you don't really need ten perl interpreters.

I've been trying to think of better ways to explain this. I'll try to explain with an analogy - it's sort-of lame, but maybe it'll give you a mental picture of what's happening. To eliminate some confusion, this analogy doesn't address LRU/MRU, nor waiting on other events like network or disk i/o. It only tries to explain why you don't necessarily need 10 perl-interpreters to handle a stream of 10 concurrent requests on a single-CPU system.

You own a fast-food restaurant. The players involved are: Your customers. These represent the http requests. Your cashiers. These represent the perl interpreters. Your cook. You only have one. This represents your CPU.

The normal flow of events is this: A cashier gets an order from a customer. The cashier goes and waits until the cook is free, and then gives the order to the cook. The cook then cooks the meal, taking 5-minutes for each meal. The cashier waits for the meal to be ready, then takes the meal and gives it to the customer. The cashier then serves another customer. The cashier/customer interaction takes a very small amount of time.

The analogy is this: An http request (customer) arrives. It is given to a perl interpreter (cashier). A perl interpreter must wait for all other perl interpreters ahead of it to finish using the CPU (the cook). It can't serve any other requests until it finishes this one. When its turn arrives, the perl interpreter uses the CPU to process the perl code. It then finishes and gives the results over to the http client (the customer).

Now, say in this analogy you begin the day with 10 customers in the store. At each 5-minute interval thereafter another customer arrives. So at time 0, there is a pool of 10 customers. At time +5, another customer arrives. At time +10, another customer arrives, ad infinitum.

You could hire 10 cashiers in order to handle this load. What would happen is that the 10 cashiers would fairly quickly get all the orders from the first 10 customers simultaneously, and then start waiting for the cook. The 10 cashiers would queue up. Cashier #1 would put in the first order. Cashiers 2-10 would wait their turn. After 5-minutes, cashier number 1 would receive the meal, deliver it to customer #1, and then serve the next customer (#11) that just arrived at the 5-minute mark. Cashier #1 would take customer #11's order, then queue up and wait in line for the cook - there will be 9 other cashiers already in line, so the wait will be long. At the 10-minute mark, cashier #2 would receive a meal from the cook, deliver it to customer #2, then go on and serve the next customer (#12) that just arrived. Cashier #2 would then go and wait in line for the cook.
This continues on through all the cashiers in order 1-10, then repeating, 1-10, ad infinitum. Now even though you have 10 cashiers, most of their time is spent waiting to put in an order to the cook. Starting with customer #11, all customers will wait 50-minutes for their meal. When customer #11 comes in he/she will immediately get to place an order, but it will take the cashier 45-minutes to wait for the cook to become free, and another 5-minutes for the meal to be cooked. Same is true for customer #12, and all customers from then on. Now, the question is, could you get the same throughput with fewer cashiers? Say you had 2 cashiers instead. The 10 customers are there waiting. The 2 cashiers take orders from customers #1 and #2. Cashier #1 then gives the order to the cook and waits. Cashier #2 waits in line for the cook behind cashier #1. At the 5-minute mark, the first meal is done. Cashier #1 delivers the meal to customer #1, then serves customer #3. Cashier #1 then goes and stands in line behind cashier #2. At the 10-minute mark, cashier #2's meal is ready - it's delivered to customer #2 and then customer #4 is served. This continues on with the cashiers trading off between serving customers. Does the scenario with two cashiers go any more slowly than the one with 10 cashiers? No. When the 11th
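The arithmetic of the analogy can be checked with a few lines. The simulation below assumes, as the story does, one cook, 5-minute meals, 10 customers at opening plus one more every 5 minutes, and at least two cashiers (so nobody ever waits for a cashier, only for the cook):

#!/usr/bin/perl
use strict;
use warnings;

# One cook (the CPU), 5-minute meals. With >= 2 cashiers the cook is
# the bottleneck, so the cashier count cancels out of the model; the
# parameter is kept only to label the runs.
sub run {
    my ($cashiers, $customers) = @_;
    my $cook_free_at = 0;      # next minute the cook is available
    my $finish_last  = 0;
    for my $c (1 .. $customers) {
        my $arrives = $c <= 10 ? 0 : ($c - 10) * 5;   # arrival minute
        my $start = $arrives > $cook_free_at ? $arrives : $cook_free_at;
        $cook_free_at = $start + 5;                   # 5-minute meal
        $finish_last  = $cook_free_at;
    }
    return $finish_last;
}

printf "10 cashiers: last of 30 meals at minute %d\n", run(10, 30);
printf " 2 cashiers: last of 30 meals at minute %d\n", run(2, 30);

Both runs print minute 150: throughput is set entirely by the cook, which is the point Sam's analogy is making about the CPU and interpreter counts.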
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
using less RAM, except under light loads, or if the amount of shared memory is extremely large. If the total amount of RAM used by the mod_perl interpreters is high enough, your system will start paging, and your performance will nosedive. Given the same load speedycgi will just maintain the same performance because it's using less RAM.

The thing is that if you know ahead of time what your load is going to be in the benchmark, you can reduce the number of httpd's so that mod_perl handles it with the same number of interpreters as speedycgi does. But how realistic that is in the real world, I don't know. With speedycgi it just sort of adapts to the load automatically. Maybe it would be possible to come up with a better benchmark that varies the load to show how speedycgi adapts better.

Here are my results (perl == mod_perl, speedy == mod_speedycgi):

*
* Benchmarking perl
*

 3:05pm up 5 min, 3 users, load average: 0.04, 0.26, 0.15

This is ApacheBench, Version 1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-1999 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient)...

Server Software:        Apache/1.3.9
Server Hostname:        localhost
Server Port:            80

Document Path:          /perl/hello_world
Document Length:        11 bytes

Concurrency Level:      300
Time taken for tests:   30.022 seconds
Complete requests:      2409
Failed requests:        0
Total transferred:      411939 bytes
HTML transferred:       26499 bytes
Requests per second:    80.24
Transfer rate:          13.72 kb/s received

Connnection Times (ms)
              min   avg   max
Connect:        0   572 21675
Processing:    30  1201  8301
Total:         30  1773 29976

This is ApacheBench, Version 1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-1999 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient)...

Server Software:        Apache/1.3.9
Server Hostname:        localhost
Server Port:            80

Document Path:          /perl/hello_world
Document Length:        11 bytes

Concurrency Level:      300
Time taken for tests:   41.872 seconds
Complete requests:      524
Failed requests:        0
Total transferred:      98496 bytes
HTML transferred:       6336 bytes
Requests per second:    12.51
Transfer rate:          2.35 kb/s received

Connnection Times (ms)
              min   avg   max
Connect:       70  1679  8864
Processing:   300  7209 14728
Total:        370       23592

*
* Benchmarking speedy
*

 3:14pm up 3 min, 3 users, load average: 0.14, 0.31, 0.15

This is ApacheBench, Version 1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-1999 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient)...

Server Software:        Apache/1.3.9
Server Hostname:        localhost
Server Port:            80

Document Path:          /speedy/hello_world
Document Length:        11 bytes

Concurrency Level:      300
Time taken for tests:   30.175 seconds
Complete requests:      6135
Failed requests:        0
Total transferred:      1060713 bytes
HTML transferred:       68233 bytes
Requests per second:    203.31
Transfer rate:          35.15 kb/s received

Connnection Times (ms)
              min   avg   max
Connect:        0   179  9122
Processing:    12   341  5710
Total:         12   520 14832

This is ApacheBench, Version 1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-1999 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient)...
Server Software:        Apache/1.3.9
Server Hostname:        localhost
Server Port:            80

Document Path:          /speedy/hello_world
Document Length:        11 bytes

Concurrency Level:      300
Time taken for tests:   30.327 seconds
Complete requests:      7034
Failed requests:        0
Total transferred:      1221795 bytes
HTML transferred:       78595 bytes
Requests per second:    231.94
Transfer rate:          40.29 kb/s received

Connnection Times (ms)
              min   avg   max
Connect:        0   237  9336
Processing:   215   405 12012
Total:        215   642 21348

Here's the hello_world script:

#!/usr/bin/speedy
## mod_perl/cgi program; iis/perl cgi; iis/perl isapi cgi

use CGI;
$x = 'x' x 65536;
my $cgi = CGI->new();
print $cgi->header();
print "Hello ";
print "World";

Here's the script I used to run the benchmarks:

#!/bin/sh
which=$1
echo "*"
echo "* Benchmarking $which"
echo "*"
uptime
httpd
sleep 5
ab -t 30 -c 300 http://localhost/$
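As a sanity check on the headline numbers, ab's "Requests per second" is just complete requests divided by elapsed time, which the raw figures above reproduce:

#!/usr/bin/perl
# Verify ab's Requests-per-second figures from the first run of each test.
printf "mod_perl:      %.2f req/sec\n", 2409 / 30.022;   # prints 80.24
printf "mod_speedycgi: %.2f req/sec\n", 6135 / 30.175;   # prints 203.31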
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
There is no coffee. Only meals. No substitutions. :-) If we added coffee to the menu it would still have to be prepared by the cook. Remember that you only have one CPU, and all the perl interpreters large and small must gain access to that CPU in order to run.

Sam

I have a wide assortment of queries on a site, some of which take several minutes to execute, while others execute in less than one second. If I understand this analogy correctly, I'd be better off with the current incarnation of mod_perl because there would be more cashiers around to serve the "quick cups of coffee" that many customers request at my diner. Is this correct?

Sam Horrocks wrote:

I think the major problem is that you're assuming that just because there are 10 constant concurrent requests, that there have to be 10 perl processes serving those requests at all times in order to get maximum throughput. The problem with that assumption is that there is only one CPU - ten processes cannot all run simultaneously anyways, so you don't really need ten perl interpreters.

I've been trying to think of better ways to explain this. I'll try to explain with an analogy - it's sort-of lame, but maybe it'll give you a mental picture of what's happening. To eliminate some confusion, this analogy doesn't address LRU/MRU, nor waiting on other events like network or disk i/o. It only tries to explain why you don't necessarily need 10 perl-interpreters to handle a stream of 10 concurrent requests on a single-CPU system.

You own a fast-food restaurant. The players involved are: Your customers. These represent the http requests. Your cashiers. These represent the perl interpreters. Your cook. You only have one. This represents your CPU.

The normal flow of events is this: A cashier gets an order from a customer. The cashier goes and waits until the cook is free, and then gives the order to the cook. The cook then cooks the meal, taking 5-minutes for each meal. The cashier waits for the meal to be ready, then takes the meal and gives it to the customer. The cashier then serves another customer. The cashier/customer interaction takes a very small amount of time.

The analogy is this: An http request (customer) arrives. It is given to a perl interpreter (cashier). A perl interpreter must wait for all other perl interpreters ahead of it to finish using the CPU (the cook). It can't serve any other requests until it finishes this one. When its turn arrives, the perl interpreter uses the CPU to process the perl code. It then finishes and gives the results over to the http client (the customer).

Now, say in this analogy you begin the day with 10 customers in the store. At each 5-minute interval thereafter another customer arrives. So at time 0, there is a pool of 10 customers. At time +5, another customer arrives. At time +10, another customer arrives, ad infinitum.

You could hire 10 cashiers in order to handle this load. What would happen is that the 10 cashiers would fairly quickly get all the orders from the first 10 customers simultaneously, and then start waiting for the cook. The 10 cashiers would queue up. Cashier #1 would put in the first order. Cashiers 2-10 would wait their turn. After 5-minutes, cashier number 1 would receive the meal, deliver it to customer #1, and then serve the next customer (#11) that just arrived at the 5-minute mark. Cashier #1 would take customer #11's order, then queue up and wait in line for the cook - there will be 9 other cashiers already in line, so the wait will be long.
At the 10-minute mark, cashier #2 would receive a meal from the cook, deliver it to customer #2, then go on and serve the next customer (#12) that just arrived. Cashier #2 would then go and wait in line for the cook. This continues on through all the cashiers in order 1-10, then repeating, 1-10, ad infinitum. Now even though you have 10 cashiers, most of their time is spent waiting to put in an order to the cook. Starting with customer #11, all customers will wait 50-minutes for their meal. When customer #11 comes in he/she will immediately get to place an order, but it will take the cashier 45-minutes to wait for the cook to become free, and another 5-minutes for the meal to be cooked. Same is true for customer #12, and all customers from then on. Now, the question is, could you get the same throughput with fewer cashiers? Say you had 2 cashiers instead. The 10 customers are there waiting. The 2 cashiers take orders from customers #1 and #2. Cashier #1 then gives the order to the cook and waits. Cashier #2 waits in line for the cook behind cashier #1. At the 5-minute mark, the first meal is done.
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
On Wed, 17 Jan 2001, Sam Horrocks wrote:

If in both the MRU/LRU case there were exactly 10 interpreters busy at all times, then you're right it wouldn't matter. But don't confuse the issues - 10 concurrent requests do *not* necessarily require 10 concurrent interpreters. The MRU has an effect on the way a stream of 10 concurrent requests are handled, and MRU results in those same requests being handled by fewer interpreters.

On a side note, what I'm curious about is how Apache decides that child processes are unused and can be killed off. The spawning of new processes is pretty aggressive on a busy server, but if the server reaches a steady state and some processes aren't needed they should be killed off. Maybe no one has bothered to make that part very efficient, since in normal circumstances most users would prefer to have extra processes waiting around than not have enough to handle a surge and have to spawn a whole bunch.

- Perrin
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Hello Sam and others,

If I haven't overlooked something, nobody so far has really mentioned fastcgi. I'm asking myself why you reinvented the wheel. I summarize the differences I see:

+ perl scripts are more similar to standard CGI ones than with FastCGI (downside: see next point)
- it seems you can't control the request loop yourself
+ protocol is more free than the one of FastCGI (is it?)
- protocol isn't widespread (almost standard) like the one of FastCGI
- seems only to support perl (so far)
- doesn't seem to support external servers (on other machines) like FastCGI (does it?)

Question: does speedycgi run a separate interpreter for each script, or is there one process loading and calling several perl scripts? If it's a separate process for each script, then mod_perl is sure to use less memory.

As far as I understand, IF you can collect several scripts together into one interpreter and IF you do preforking, I don't see essential performance-related differences between mod_perl and speedy/fastcgi if you set up mod_perl with the proxy approach. With mod_perl the protocol to the backends is http, with speedy it's speedy and with fastcgi it's the fastcgi protocol. (The difference between mod_perl and fastcgi is that fastcgi uses a request loop, whereas mod_perl has its handlers (sorry, I never really used mod_perl so I don't know exactly).)

I think it's a pity that during the last years there was so little interest/support for fastcgi, and now that should change with speedycgi. But why not, if the stuff that people develop can run on both and speedy is/becomes better than fastcgi.

I'm developing a web application framework (called 'Eile', you can see some outdated documentation on testwww.ethz.ch/eile, I will release a new much better version soon) which currently uses fastcgi. If I can get it to run with speedycgi, I'll be glad to release it with support for both protocols. I haven't looked very closely at it yet. One of the problems seems to be that I really depend on controlling the request loop (initialization, preforking etc. all have to be done before the application begins serving requests, and I'm also controlling exits of children myself). If you're interested in helping me solve these issues please contact me privately.

The main advantages of Eile concerning resources are a) one process/interpreter runs dozens of 'scripts' (called page-processing modules), and you don't have to dispatch requests to each of them yourself, and b) my new version does preforking.

Christian.

-- Web Office, Christian Jaeger, Corporate Communications, ETH Zentrum, CH-8092 Zurich, office: HG J43, e-mail: [EMAIL PROTECTED], phone: +41 (0)1 63 2 5780, [EMAIL PROTECTED], home: +41 (0)1 273 65 46, fax: +41 (0)1 63 2 3525
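For reference, the "request loop" Christian wants to keep control of looks like this under the Perl FCGI module. This is a minimal sketch; the recycle limit at the end is an assumption added for illustration (in the spirit of SpeedyCGI's MaxRuns), not anything Eile actually does:

#!/usr/bin/perl
use strict;
use warnings;
use FCGI;

# The explicit FastCGI request loop: everything before the loop
# (expensive initialization, preloading, preforking hooks) runs once
# per process; everything inside runs once per request.
my $count   = 0;
my $request = FCGI::Request();

# ... one-time initialization would go here ...

while ($request->Accept() >= 0) {          # block until the next request
    print "Content-type: text/plain\r\n\r\n";
    print "request ", ++$count, " served by pid $$\n";
    last if $count >= 500;                 # recycle this child, MaxRuns-style
}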
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
I have just gotten around to reading this thread I've been saving for a rainy day. Well, it's not rainy, but I'm finally getting to it. Apologies to those who hate it when people don't snip their reply mails, but I am including it so that the entire context is not lost.

Sam (or others who may understand Sam's explanation), I am still confused by this explanation of MRU helping when there are 10 processes serving 10 requests at all times. I understand MRU helping when the processes are not at max, but I don't see how it helps when they are at max utilization. It seems to me that if the wait is the same for mod_perl backend processes and speedyCGI processes, that it doesn't matter if some of the speedycgi processes cycle earlier than the mod_perl ones because all 10 will always be used.

I did read and reread (once) the snippets about modeling concurrency and the HTTP waiting for an accept. But I still don't understand how MRU helps when all the processes would be in use anyway. At that point they all have an equal chance of being called.

Could you clarify this with a simpler example? Maybe 4 processes and a sample timeline of what happens to those when there are enough requests to keep all 4 busy all the time for speedyCGI and a mod_perl backend?

At 04:32 AM 1/6/01 -0800, Sam Horrocks wrote:

Let me just try to explain my reasoning. I'll define a couple of my base assumptions, in case you disagree with them. - Slices of CPU time doled out by the kernel are very small - so small that processes can be considered concurrent, even though technically they are handled serially.

Don't agree. You're equating the model with the implementation. Unix processes model concurrency, but when it comes down to it, if you don't have more CPU's than processes, you can only simulate concurrency. Each process runs until it either blocks on a resource (timer, network, disk, pipe to another process, etc), or a higher priority process pre-empts it, or it's taken so much time that the kernel wants to give another process a chance to run.

- A set of requests can be considered "simultaneous" if they all arrive and start being handled in a period of time shorter than the time it takes to service a request.

That sounds OK.

Operating on these two assumptions, I say that 10 simultaneous requests will require 10 interpreters to service them. There's no way to handle them with fewer, unless you queue up some of the requests and make them wait.

Right. And that waiting takes place: - In the mutex around the accept call in the httpd - In the kernel's run queue when the process is ready to run, but is waiting for other processes ahead of it.

So, since there is only one CPU, then in both cases (mod_perl and SpeedyCGI), processes spend time waiting. But what happens in the case of SpeedyCGI is that while some of the httpd's are waiting, one of the earlier speedycgi perl interpreters has already finished its run through the perl code and has put itself back at the front of the speedycgi queue. And by the time that Nth httpd gets around to running, it can re-use that first perl interpreter instead of needing yet another process. This is why it's important that you don't assume that Unix is truly concurrent.

I also say that if you have a top limit of 10 interpreters on your machine because of memory constraints, and you're sending in 10 simultaneous requests constantly, all interpreters will be used all the time. In that case it makes no difference to the throughput whether you use MRU or LRU.
This is not true for SpeedyCGI, because of the reason I give above. 10 simultaneous requests will not necessarily require 10 interpreters. What you say would be true if you had 10 processors and could get true concurrency. But on single-cpu systems you usually don't need 10 unix processes to handle 10 requests concurrently, since they get serialized by the kernel anyways.

I think the CPU slices are smaller than that. I don't know much about process scheduling, so I could be wrong.

I would agree with you if we were talking about requests that were coming in with more time between them. Speedycgi will definitely use fewer interpreters in that case. This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

I found that setting MaxClients to 100 stopped the paging. At concurrency level 100, both mod_perl and mod_speedycgi showed similar rates with ab. Even at higher levels (300), they were comparable.

That's what I would expect if both systems have a similar limit of how many interpreters they can fit in RAM at once. Shared memory would help here, since it would allow more interpreters to run.
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Les Mikesell wrote: [cut] I don't think I understand what you mean by LRU. When I view the Apache server-status with ExtendedStatus On, it appears that the backend server processes recycle themselves as soon as they are free instead of cycling sequentially through all the available processes. Did you mean to imply otherwise or are you talking about something else? Be careful here. Note my message earlier in the thread about the misleading effect of persistent connections (HTTP 1.1). Perrin Harkins noted in another thread that it had fooled him as well as me. Not saying that's what you're seeing, just take it into account. (Quick-and-dirty test: run Netscape as the client browser; do you still see the same thing?)
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Sam Horrocks wrote: A few things:

- In your results, could you add the speedycgi version number (2.02), and the fact that this is using the mod_speedycgi frontend. The version numbers are gathered at runtime, so for mod_speedycgi, this would get picked up if you registered it in the Apache server header that gets sent out.

I'll list the test as mod_speedycgi.

The fork/exec frontend will be much slower on hello-world so I don't want people to get the wrong idea. You may want to benchmark the fork/exec version as well.

If it's slower then what's the point :) If mod_speedycgi is the faster way to run it, then that should be good enough, no? If you would like to contribute that test to the suite, please do so.

- You may be able to eke out a little more performance by setting MaxRuns to 0 (infinite). This is set for mod_speedycgi using the SpeedyMaxRuns directive, or on the command-line using "-r0". This setting is similar to the MaxRequestsPerChild setting in apache.

Will do.

- My tests show mod_perl/speedy much closer than yours do, even with MaxRuns at its default value of 500. Maybe you're running on a different OS than I am - I'm using Redhat 6.2. I'm also running one rev lower of mod_perl in case that matters.

I'm running the same thing, RH 6.2. I don't know if the mod_perl rev matters, but what often does matter is that I have 2 CPUs in my box, so my results often look different from other people's.

--Josh

_
Joshua Chamas, Chamas Enterprises Inc.
NodeWorks free web link monitoring, Huntington Beach, CA USA
http://www.nodeworks.com  1-714-625-4051
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Let me just try to explain my reasoning. I'll define a couple of my base assumptions, in case you disagree with them. - Slices of CPU time doled out by the kernel are very small - so small that processes can be considered concurrent, even though technically they are handled serially.

Don't agree. You're equating the model with the implementation. Unix processes model concurrency, but when it comes down to it, if you don't have more CPU's than processes, you can only simulate concurrency. Each process runs until it either blocks on a resource (timer, network, disk, pipe to another process, etc), or a higher priority process pre-empts it, or it's taken so much time that the kernel wants to give another process a chance to run.

- A set of requests can be considered "simultaneous" if they all arrive and start being handled in a period of time shorter than the time it takes to service a request.

That sounds OK.

Operating on these two assumptions, I say that 10 simultaneous requests will require 10 interpreters to service them. There's no way to handle them with fewer, unless you queue up some of the requests and make them wait.

Right. And that waiting takes place: - In the mutex around the accept call in the httpd - In the kernel's run queue when the process is ready to run, but is waiting for other processes ahead of it.

So, since there is only one CPU, then in both cases (mod_perl and SpeedyCGI), processes spend time waiting. But what happens in the case of SpeedyCGI is that while some of the httpd's are waiting, one of the earlier speedycgi perl interpreters has already finished its run through the perl code and has put itself back at the front of the speedycgi queue. And by the time that Nth httpd gets around to running, it can re-use that first perl interpreter instead of needing yet another process. This is why it's important that you don't assume that Unix is truly concurrent.

I also say that if you have a top limit of 10 interpreters on your machine because of memory constraints, and you're sending in 10 simultaneous requests constantly, all interpreters will be used all the time. In that case it makes no difference to the throughput whether you use MRU or LRU.

This is not true for SpeedyCGI, because of the reason I give above. 10 simultaneous requests will not necessarily require 10 interpreters. What you say would be true if you had 10 processors and could get true concurrency. But on single-cpu systems you usually don't need 10 unix processes to handle 10 requests concurrently, since they get serialized by the kernel anyways.

I think the CPU slices are smaller than that. I don't know much about process scheduling, so I could be wrong.

I would agree with you if we were talking about requests that were coming in with more time between them. Speedycgi will definitely use fewer interpreters in that case. This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

I found that setting MaxClients to 100 stopped the paging. At concurrency level 100, both mod_perl and mod_speedycgi showed similar rates with ab. Even at higher levels (300), they were comparable.

That's what I would expect if both systems have a similar limit of how many interpreters they can fit in RAM at once. Shared memory would help here, since it would allow more interpreters to run. By the way, do you limit the number of SpeedyCGI processes as well?
It seems like you'd have to, or they'd start swapping too when you throw too many requests in.

SpeedyCGI has an optional limit on the number of processes, but I didn't use it in my testing. But, to show that the underlying problem is still there, I then changed the hello_world script and doubled the amount of un-shared memory. And of course the problem then came back for mod_perl, although speedycgi continued to work fine. I think this shows that mod_perl is still using quite a bit more memory than speedycgi to provide the same service.

I'm guessing that what happened was you ran mod_perl into swap again. You need to adjust MaxClients when your process size changes significantly.

Right, but this also points out how difficult it is to get mod_perl tuning just right. My opinion is that the MRU design adapts more dynamically to the load. I believe that with speedycgi you don't have to lower the MaxClients setting, because it's able to handle a larger number of clients, at least in this test.

Maybe what you're seeing is an ability to handle a larger number of requests (as opposed to clients) because of the performance benefit I mentioned above.

I don't follow. When not all processes are in use, I think Speedy would handle requests more quickly, which would allow it to handle n requests in less time than mod_perl.
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Sam Horrocks wrote:

Don't agree. You're equating the model with the implementation. Unix processes model concurrency, but when it comes down to it, if you don't have more CPU's than processes, you can only simulate concurrency. [...] This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

Thanks for the info. This makes much more sense to me now. It sounds like using an MRU algorithm for process selection is automatically finding the sweet spot in terms of how many processes can run within the space of one request, and coming close to the ideal of never having unused processes in memory. Now I'm really looking forward to getting MRU and shared memory in the same package and seeing how high I can scale my hardware.

- Perrin
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Does this mean that mod_perl's memory hunger will be curbed in the future using some of the neat tricks in Speedycgi?

Perrin Harkins wrote:

Sam Horrocks wrote: Don't agree. You're equating the model with the implementation. Unix processes model concurrency, but when it comes down to it, if you don't have more CPU's than processes, you can only simulate concurrency. [...] This url: http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html says the default timeslice is 210ms (1/5th of a second) for Linux on a PC. There's also lots of good info there on Linux scheduling.

Thanks for the info. This makes much more sense to me now. It sounds like using an MRU algorithm for process selection is automatically finding the sweet spot in terms of how many processes can run within the space of one request, and coming close to the ideal of never having unused processes in memory. Now I'm really looking forward to getting MRU and shared memory in the same package and seeing how high I can scale my hardware.

- Perrin

-- www.RentZone.org
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Buddy Lee Haystack wrote: Does this mean that mod_perl's memory hunger will be curbed in the future using some of the neat tricks in SpeedyCGI?

Yes. The upcoming mod_perl 2 (running on Apache 2) will use MRU to select threads. Doug demoed this at ApacheCon a few months back. - Perrin
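For the curious, interpreter-pool tuning in the threaded mod_perl 2 model is expected to look something like the sketch below. The directive names follow the mod_perl 2 design documents, but since mod_perl 2 is still pre-release, treat the names and especially the numbers as placeholders rather than recommendations.

  # httpd.conf sketch for Apache 2 with a threaded MPM + mod_perl 2.
  # MRU selection is how the pool hands out interpreters internally;
  # these directives only size the pool.
  PerlInterpStart       4      # interpreters cloned at server startup
  PerlInterpMax         10     # hard ceiling on interpreters
  PerlInterpMinSpare    2      # grow the pool when idle count drops below this
  PerlInterpMaxSpare    5      # shrink the pool when idle count exceeds this
  PerlInterpMaxRequests 2000   # recycle an interpreter after this many requests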
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
- Original Message -
From: "Sam Horrocks" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: "mod_perl list" [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, January 06, 2001 6:32 AM
Subject: Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory

Right, but this also points out how difficult it is to get mod_perl tuning just right. My opinion is that the MRU design adapts more dynamically to the load.

How would this compare to apache's process management when using the front/back end approach?

I'd agree that the size of one Speedy backend + one httpd would be the same or even greater than the size of one mod_perl/httpd when no memory is shared. But because the speedycgi httpds are small (no perl in them) and the number of SpeedyCGI perl interpreters is small, the total memory required is significantly smaller for the same load.

Likewise, it would be helpful if you would always make the comparison to the dual httpd setup that is often used for busy sites. I think it must really boil down to the efficiency of your IPC vs. access to the full apache environment.

Les Mikesell [EMAIL PROTECTED]
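For readers unfamiliar with the dual-httpd setup Les is referring to, a minimal sketch looks like this (ports, paths, and addresses are placeholders, not recommendations): a lightweight frontend httpd proxies dynamic requests to a heavyweight mod_perl httpd bound to localhost, so only a small number of large perl-carrying processes are needed.

  # Frontend httpd.conf (plain Apache, no mod_perl compiled in):
  ProxyPass        /perl/ http://127.0.0.1:8080/perl/
  ProxyPassReverse /perl/ http://127.0.0.1:8080/perl/

  # Backend httpd.conf (mod_perl, reachable only from localhost):
  Port 8080
  BindAddress 127.0.0.1
  <Location /perl>
      SetHandler perl-script
      PerlHandler Apache::Registry
      Options +ExecCGI
  </Location>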
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Sam Horrocks wrote: Don't agree. You're equating the model with the implementation. Unix processes model concurrency, but when it comes down to it, if you don't have more CPUs than processes, you can only simulate concurrency.

Hey Sam, nice module. I just installed your SpeedyCGI for a good ol' HelloWorld benchmark; it was a snap, well done. I'd like to add to the numbers below that a fair benchmark would be between mod_proxy in front of a mod_perl server and mod_speedycgi, as that would be a similar memory-saving model (this is how we often scale mod_perl)... both models would end up forwarding back to a smaller set of persistent perl interpreters. However, I did not do such a benchmark, so SpeedyCGI loses out a bit for the extra layer it has to do :( This is based on the suite at http://www.chamas.com/bench/hello.tar.gz, but I have not included the speedy test in that yet. -- Josh

  Test Name                      Test File  Hits/sec  Total Hits  Total Time  sec/Hits  Bytes/Hit
  -----------------------------  ---------  --------  ----------  ----------  --------  ---------
  Apache::Registry v2.01 CGI.pm  hello.cgi     451.9  27128 hits   60.03 sec  0.002213  216 bytes
  Speedy CGI                     hello.cgi     375.2  22518 hits   60.02 sec  0.002665  216 bytes

Apache Server Header Tokens: (Unix) Apache/1.3.14 OpenSSL/0.9.6 PHP/4.0.3pl1 mod_perl/1.24 mod_ssl/2.7.1
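For reference, a hello-world script of the sort being benchmarked is sketched below (my reconstruction; the actual scripts are in the hello.tar.gz suite above). The same file can serve both tests: under Apache::Registry it runs inside mod_perl, while changing the shebang to /usr/bin/speedy routes it to SpeedyCGI's persistent interpreters.

  #!/usr/bin/perl
  # hello.cgi - minimal CGI.pm hello world; swap the shebang for
  # #!/usr/bin/speedy to run it under SpeedyCGI instead.
  use strict;
  use CGI ();

  my $q = CGI->new;
  print $q->header,
        $q->start_html('Hello'),
        $q->h1('Hello World'),
        $q->end_html;

A 60-second run like the ones in the table would be driven by something along the lines of the following (hostname, path, and concurrency are hypothetical):

  ab -c 100 -t 60 http://localhost/perl/hello.cgi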
Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
Right, but this also points out how difficult it is to get mod_perl tuning just right. My opinion is that the MRU design adapts more dynamically to the load. How would this compare to apache's process management when using the front/back end approach?

Same thing applies. The front/back end approach does not change the fundamentals.

I'd agree that the size of one Speedy backend + one httpd would be the same or even greater than the size of one mod_perl/httpd when no memory is shared. But because the speedycgi httpds are small (no perl in them) and the number of SpeedyCGI perl interpreters is small, the total memory required is significantly smaller for the same load.

Likewise, it would be helpful if you would always make the comparison to the dual httpd setup that is often used for busy sites. I think it must really boil down to the efficiency of your IPC vs. access to the full apache environment.

The reason I don't include that comparison is that it's not fundamental to the differences between mod_perl and speedycgi, or between LRU and MRU, that I have been trying to point out. Regardless of whether you add a frontend or not, the mod_perl process selection remains LRU and the speedycgi process selection remains MRU.