Sam Horrocks wrote:
>  say they take two slices, and interpreters 1 and 2 get pre-empted and
>  go back into the queue.  So then requests 5/6 in the queue have to use
>  other interpreters, and you expand the number of interpreters in use.
>  But still, you'll wind up using the smallest number of interpreters
>  required for the given load and timeslice.  As soon as those 1st and
>  2nd perl interpreters finish their run, they go back at the beginning
>  of the queue, and the 7th/8th or later requests can then use them, etc.
>  Now you have a pool of maybe four interpreters, all being used on an MRU
>  basis.  But it won't expand beyond that set unless your load goes up or
>  your program's CPU time requirements increase beyond another timeslice.
>  MRU will ensure that whatever the number of interpreters in use, it
>  is the lowest possible, given the load, the CPU-time required by the
>  program and the size of the timeslice.
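
Just to make sure I'm following the scenario you describe, here's a toy
model of that MRU selection in Python (my own sketch for illustration,
not Speedy's actual code):

    # Toy model of MRU interpreter selection: a finished interpreter goes
    # back on the *front* of the queue, so the hottest interpreters get
    # reused and the pool only grows when every existing one is busy.
    from collections import deque

    class MRUPool:
        def __init__(self):
            self.idle = deque()   # most recently used interpreter at the left
            self.spawned = 0

        def checkout(self):
            if self.idle:
                return self.idle.popleft()   # reuse the hottest interpreter
            self.spawned += 1                # all busy: spawn a new one
            return "interp-%d" % self.spawned

        def checkin(self, interp):
            self.idle.appendleft(interp)     # MRU: back to the head of the queue

    pool = MRUPool()
    busy = [pool.checkout() for _ in range(4)]   # four requests overlap
    for interp in busy:
        pool.checkin(interp)
    # Later requests keep hitting the same few interpreters; the pool
    # never grows past the peak concurrency of 4.
    print(pool.spawned, [pool.checkout() for _ in range(2)])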

You know, I had a brief look through some of the SpeedyCGI code yesterday,
and I think the MRU process selection might be a bit of a red herring. 
I think the real reason Speedy won the memory test is the way it spawns
processes.

If I understand what's going on in Apache's source, once every second it
has a look at the scoreboard and says "fewer than MinSpareServers are
idle, so I'll start more" or "more than MaxSpareServers are idle, so
I'll kill one".  It only kills one per second.  It starts by spawning
one, but the number spawned goes up exponentially each time it sees
there are still not enough idle servers, until it hits 32 per second. 
It's easy to see how this could result in spawning too many in response
to sudden load, and then taking a long time to clear out the unnecessary
ones.
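
For what it's worth, here is roughly how I read that maintenance logic,
as a little Python sketch (my own approximation of the behavior, not the
actual httpd source; the constants just stand in for MinSpareServers,
MaxSpareServers and the 32-per-second cap):

    MIN_SPARE = 5             # stands in for MinSpareServers
    MAX_SPARE = 10            # stands in for MaxSpareServers
    MAX_SPAWN_PER_CHECK = 32  # the per-second spawning cap described above

    spawn_rate = 1

    def maintain(idle_count):
        """Called once per second; returns children to start (+) or kill (-)."""
        global spawn_rate
        if idle_count > MAX_SPARE:
            return -1                    # kill only one idle child per check
        if idle_count < MIN_SPARE:
            to_spawn = spawn_rate
            # ramp up exponentially while there still aren't enough spares
            spawn_rate = min(spawn_rate * 2, MAX_SPAWN_PER_CHECK)
            return to_spawn
        spawn_rate = 1                   # enough spares: reset the ramp
        return 0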

In contrast, Speedy checks on every request to see if there are enough
backends running.  If there aren't, it spawns more until there are as
many backends as queued requests.  That means it never overshoots the
mark.
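
In sketch form, that per-request check amounts to something like this
(again my own approximation, not Speedy's actual code):

    def backends_to_spawn(running_backends, queued_requests):
        """Spawn only enough backends to cover the requests waiting right now."""
        return max(0, queued_requests - running_backends)

    # e.g. 3 backends running and 5 requests queued: spawn exactly 2 more,
    # never more than the instantaneous demand.
    print(backends_to_spawn(3, 5))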

Going back to your example above, if Apache actually controlled the
number of processes tightly enough to prevent building up idle servers,
it wouldn't really matter much how processes were selected.  If, after
the 1st and 2nd interpreters finish their run, they went to the end of
the queue instead of the beginning of it, that would simply mean they
sit idle until called for instead of some other two processes sitting
idle until called for.  If both systems were efficient enough about
spawning to only create as many interpreters as needed, none of them
would be sitting idle and memory usage would always be as low as
possible.
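
Here's a quick simulation of that claim (my own toy model, not either
server's code): with a spawner that only creates a process when none are
idle, MRU and FIFO selection create exactly the same number of processes
for the same load; they only differ in which processes sit idle.

    from collections import deque

    def processes_created(load, mru):
        """load is the number of concurrent requests in each timeslice."""
        idle, created = deque(), 0
        for busy_now in load:
            in_use = []
            for _ in range(busy_now):
                if idle:
                    in_use.append(idle.popleft())   # reuse an idle process
                else:
                    created += 1                    # none idle: spawn one
                    in_use.append(created)
            for p in in_use:                        # requests finish
                if mru:
                    idle.appendleft(p)              # back to the head (MRU)
                else:
                    idle.append(p)                  # back to the tail (FIFO)
        return created

    load = [2, 4, 3, 4, 2, 1]
    print(processes_created(load, mru=True))    # 4
    print(processes_created(load, mru=False))   # 4 -- same peak either way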

I don't know if I'm explaining this very well, but the gist of my theory
is that at any given time both systems will require an equal number of
in-use interpreters to do an equal amount of work, and the differentiator
between the two is Apache's relatively poor estimate of how many
processes should be available at any given time.  I think this theory
matches up nicely with the results of Sam's tests: when MaxClients
prevents Apache from spawning too many processes, both systems have
similar performance characteristics.

There are some knobs to twiddle in Apache's source if anyone is
interested in playing with it.  You can change the frequency of the
checks and the maximum number of servers spawned per check.  I don't
have much motivation to do this investigation myself, since I've already
tuned our MaxClients and process size constraints to prevent problems
with our application.

- Perrin
