Here is another benchmark for your enjoyment. Comments are welcome!

-------------------------------------------------------------------

Choosing the MaxClients Directive

It's important to set this parameter according to the resources your
machine has. The C<MaxClients> directive limits the number of
simultaneous requests that can be served: no more than this number of
child server processes will be created. To configure more than 256
clients, you must edit the C<HARD_SERVER_LIMIT> entry in I<httpd.h>
and recompile.

With a plain Apache server, it's no big deal if you run many child
processes, since each process is about 1MB (most of it shared) and
doesn't eat much of your RAM. The situation is different with
mod_perl, where processes can grow to 10MB and more. If you have
C<MaxClients> set to 50, that is 50 x 10MB = 500MB. Do you have 500MB
of RAM dedicated to the mod_perl server?

With a high C<MaxClients>, the server will try to serve all requests
immediately when the load is high. Your CPU will have a hard time
keeping up, and if the child size multiplied by the number of running
children is larger than the total available RAM, your server will
start swapping. This slows everything down, which in turn makes
things even slower, until eventually the machine dies. Take pains to
ensure that swapping does not normally happen. Swap space is an
emergency pool, not a resource to be used routinely. If you are low
on memory and you badly need it, buy it. Memory is cheap.

We want this directive to be as small as possible, because this
limits the resources used by the server children.
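
The arithmetic above can be sketched as a quick check. This is only an
illustration: the sub names C<worst_case_footprint_mb> and
C<would_swap> are made up for this example, sharing between children is
ignored, and the 400MB machine in the last line is a hypothetical
figure, not one from the text.

```perl
use strict;
use warnings;

# Worst-case memory footprint if every child is alive and at full size.
# All sizes are in MB; memory shared between children is ignored here.
sub worst_case_footprint_mb {
    my ($max_clients, $child_size_mb) = @_;
    return $max_clients * $child_size_mb;
}

# Returns 1 if the worst-case footprint exceeds the RAM set aside for
# the server (the machine would be pushed into swap), 0 otherwise.
sub would_swap {
    my ($max_clients, $child_size_mb, $available_ram_mb) = @_;
    return worst_case_footprint_mb($max_clients, $child_size_mb)
        > $available_ram_mb ? 1 : 0;
}

# Numbers from the text: 50 children of 10MB each.
my $footprint = worst_case_footprint_mb(50, 10);
print "Worst case: ${footprint}MB\n";

# Hypothetical machine with only 400MB to spare for mod_perl.
print "Would swap with 400MB: ",
    would_swap(50, 10, 400) ? "yes" : "no", "\n";
```

If the check says "yes", either C<MaxClients> or the per-child size has
to come down, or the machine needs more RAM.
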
Since we can restrict each child's process size, as we will learn
later, the calculation of C<MaxClients> is pretty straightforward:

               Total RAM Dedicated to the Webserver
  MaxClients = ------------------------------------
                     MAX child's process size

So if I have 400MB left for mod_perl to run with, I can set
C<MaxClients> to 40 if I know that each child is limited to 10MB of
memory.

You may wonder what happens to your server if there are more
concurrent users than C<MaxClients> at any time. This situation is
signalled by the following warning message in the C<error_log>:

  [Sun Jan 24 12:05:32 1999] [error] server reached MaxClients setting, consider raising the MaxClients setting

Technically there is no problem: any connection attempts over the
C<MaxClients> limit will normally be queued, up to a number based on
the C<ListenBacklog> directive. When a child process is freed at the
end of a different request, the queued connection will be served.

But it B<is an error>, because clients are being put in the queue
rather than being served immediately, even though they do not get an
error response. The error can be allowed to persist in order to
balance available system resources against response time, but sooner
or later you will need to get more RAM so you can start more child
processes. The best approach is to avoid reaching this condition at
all; if you reach it often, you should start to worry about it.

Fortunately, the real picture is better than this. If you recall the
discussion about shared memory, much less memory is actually used
when sharing is put to work.
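
The simple calculation above (dedicated RAM divided by the maximum
child size) can be written down as a small helper. This is a minimal
sketch: the sub name C<simple_max_clients> is invented for the example,
and rounding down is an assumption made here because a fraction of a
child process cannot be started.

```perl
use strict;
use warnings;

# MaxClients = RAM dedicated to the web server / maximum child size,
# rounded down. Both arguments are in MB.
sub simple_max_clients {
    my ($dedicated_ram_mb, $max_child_size_mb) = @_;
    die "maximum child size must be positive\n"
        unless $max_child_size_mb > 0;
    return int($dedicated_ram_mb / $max_child_size_mb);
}

# Numbers from the text: 400MB for mod_perl, children limited to 10MB.
print "MaxClients ", simple_max_clients(400, 10), "\n";
```

With the numbers from the text this gives the same value of 40 that was
worked out by hand.
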
We have developed this formula:

               Total_RAM + Shared_RAM_per_Child * MaxClients
  MaxClients = --------------------------------------------- - 1
                             Max_Process_Size

which, solved for C<MaxClients>, gives:

                     Total_RAM - Max_Process_Size
  MaxClients = ---------------------------------------
               Max_Process_Size - Shared_RAM_per_Child

Let's roll some calculations:

  Total_RAM            = 500MB
  Max_Process_Size     =  10MB
  Shared_RAM_per_Child =   4MB

               500 - 10
  MaxClients = -------- = 81
                10 - 4

With no sharing in place:

               500
  MaxClients = --- = 50
                10

With sharing in place, if your numbers are similar to the ones in our
example, you can have about 60% more servers without buying more RAM
(81 compared to 50).

If you improve sharing, and the sharing level is maintained throughout
the child's life, let's say:

  Total_RAM            = 500MB
  Max_Process_Size     =  10MB
  Shared_RAM_per_Child =   8MB

               500 - 10
  MaxClients = -------- = 245
                10 - 8

you can have 390% more servers (245 compared to 50)!

There is one more nuance to remember. The number of requests per
second that your server can serve won't grow linearly as you raise the
value of C<MaxClients>. Assuming that you have a lot of RAM available
and you try to set C<MaxClients> as high as possible, you will see
that beyond a certain value, further increases give you no improvement
in performance. The more clients are running, the more CPU time is
required and the fewer CPU time slices each process receives. The
response latency (the time to respond to a request) grows, so you
won't see the expected improvement.

Let's use the C<Apache::Benchmark> module to help us prove that. This
is the test handler that we used. You can see that it does mostly
CPU-intensive computations.

  httpd/perl/Benchmark/HandlerMiddle.pm
  -------------------------------------
  package Benchmark::HandlerMiddle;

  use strict;
  use Apache::Constants qw(:common);

  sub handler {
      my $r = shift;
      $r->send_http_header('text/html');
      $r->print("Hello");

      # burn some CPU: 101 iterations of a log/power computation
      my $x = 100;
      for (0..100) {
          my $y = log($x ** 100);
      }

      return OK;
  }
  1;

The following two files are the test specification and the extra
configuration that is added to I<httpd.conf> before each server
restart (the server is restarted for each subtest). Notice that the
test specification includes the httpd variables as well, so for each
test I<httpd.conf> is modified to include the varying value of
C<MaxClients> and the constant values of C<StartServers> and
C<MaxRequestsPerChild>:

  tests/maxclients/maxclients.t
  -----------------------------
  my $uri_prefix = "http://$c{default}{hostname}:$c{default}{port}";

  my $maxclients =
    {
     name        => "handler_heavy",
     desc        => "This test tests how the MaxClients directive " .
                    "influences the performance of the server",
     uri         => "$uri_prefix/benchmark_handler_middle",
     concurrency => [300],
     connections => [50000],
     MaxClients          => [100,150,200,250],
     StartServers        => 100,
     MaxRequestsPerChild => 0,
    };

  @entries = ($maxclients);
  1;

  tests/maxclients/maxclients.conf
  --------------------------------
  PerlModule Benchmark::HandlerMiddle
  <Location /benchmark_handler_middle>
    SetHandler perl-script
    PerlHandler Benchmark::HandlerMiddle
  </Location>

And the results (the machine under test was a monster!):

  MaxClients | avtime  completed  failed  rps
  -------------------------------------------
         150 |    342      50000       0  791
         200 |    339      50000       0  785
         100 |    333      50000       0  755
         250 |    402      50000       0  741
  -------------------------------------------
  Non-varying sub-test parameters:
  -------------------------------------------
  MaxRequestsPerChild : 0
  StartServers        : 100
  concurrency         : 300
  connections         : 50000
  -------------------------------------------

When looking at the I<Requests Per Second> (I<rps>) column, you can
clearly see that with a
concurrency level of 300, the performance is almost identical for
C<MaxClients> values of 150 and 200, but drops for a value of 100
(not enough processes), and is even worse for a value of 250. Note
that we kept the server fully loaded: the number of concurrent
requests was always higher than the number of available processes,
which means that some requests were queued rather than answered
immediately. When the number of processes went above 200, the
processes spent more time in the sleep state and context switching
than doing real processing. On the other hand, with only 100
available processes the CPU was not fully loaded, although we had
plenty of memory available. You can see that in our case 150 was the
optimal value.

This leads us to an interesting discovery, which we can formulate as
follows: adding RAM might not improve performance if your CPU is
already fully loaded with the current number of processes, and if you
start more of them, performance will degrade. If, on the other hand,
you upgrade your machine with a very powerful CPU but don't have
enough memory to keep that CPU busy full time, you've just wasted
money. You would have done better to buy a less powerful, less
expensive CPU and spend the rest of the budget on RAM.

To discover the capacity of your server, run benchmarks just like we
did. By playing with the configuration parameters and different
loads, you will be able to find the underloaded or overloaded
component; in our example the candidates were the CPU and the RAM.

You can tune your machine using reports like the one in our example,
by analyzing either the I<Requests Per Second> (I<rps>) column, which
shows the throughput of your server, or the I<Average processing
time> (I<avtime>) column, which shows its latency.
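
Reading the two columns can also be done mechanically. The following
is a rough sketch, not part of the benchmark framework: the C<@results>
structure and the subs C<best_throughput> and C<best_latency> are
invented for this example, and the numbers are copied by hand from the
report above.

```perl
use strict;
use warnings;

# Results copied from the report above: one hash per sub-test.
my @results = (
    { MaxClients => 150, avtime => 342, rps => 791 },
    { MaxClients => 200, avtime => 339, rps => 785 },
    { MaxClients => 100, avtime => 333, rps => 755 },
    { MaxClients => 250, avtime => 402, rps => 741 },
);

# Row with the highest throughput (requests per second).
sub best_throughput {
    my @rows = sort { $b->{rps} <=> $a->{rps} } @_;
    return $rows[0];
}

# Row with the lowest latency (average processing time).
sub best_latency {
    my @rows = sort { $a->{avtime} <=> $b->{avtime} } @_;
    return $rows[0];
}

my $fastest  = best_throughput(@results);
my $snappier = best_latency(@results);

printf "Highest rps:   MaxClients %d (%d rps)\n",
    $fastest->{MaxClients}, $fastest->{rps};
printf "Lowest avtime: MaxClients %d (avtime %d)\n",
    $snappier->{MaxClients}, $snappier->{avtime};

# Rows ordered by MaxClients, convenient for plotting a curve.
for my $row (sort { $a->{MaxClients} <=> $b->{MaxClients} } @results) {
    printf "%4d  %4d  %4d\n", @{$row}{qw(MaxClients avtime rps)};
}
```

With these four samples the two columns do not point at the same row:
throughput peaks at 150, while the lowest average time is at 100, so
you have to decide which of the two you are tuning for.
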
Take more samples to build smoother graphs, and pick the value of
C<MaxClients> at which the curve bends: just after the maximum on a
throughput graph, or just after the minimum on a latency graph.

_____________________________________________________________________
Stas Bekman              JAm_pH -- Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://perl.org http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org