> Let me just try to explain my reasoning.  I'll define a couple of my
 > base assumptions, in case you disagree with them.
 > 
 > - Slices of CPU time doled out by the kernel are very small - so small
 > that processes can be considered concurrent, even though technically
 > they are handled serially.

 Don't agree.  You're equating the model with the implementation.
 Unix processes model concurrency, but when it comes down to it, if you
 don't have more CPUs than processes, you can only simulate concurrency.

 Each process runs until it either blocks on a resource (timer, network,
 disk, pipe to another process, etc.), or a higher priority process
 pre-empts it, or it's taken so much time that the kernel wants to give
 another process a chance to run.

 > - A set of requests can be considered "simultaneous" if they all arrive
 > and start being handled in a period of time shorter than the time it
 > takes to service a request.

 That sounds OK.

 > Operating on these two assumptions, I say that 10 simultaneous requests
 > will require 10 interpreters to service them.  There's no way to handle
 > them with fewer, unless you queue up some of the requests and make them
 > wait.

 Right.  And that waiting takes place:

    - In the mutex around the accept call in the httpd

    - In the kernel's run queue when the process is ready to run, but is
      waiting for other processes ahead of it.

 So, since there is only one CPU, in both cases (mod_perl and
 SpeedyCGI), processes spend time waiting.  But what happens in the
 case of SpeedyCGI is that while some of the httpds are waiting,
 one of the earlier speedycgi perl interpreters has already finished
 its run through the perl code and has put itself back at the front of
 the speedycgi queue.  And by the time the Nth httpd gets around to
 running, it can re-use that first perl interpreter instead of needing
 yet another process.
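
 Here's a toy sketch of what I mean (my own model, not SpeedyCGI's
 actual code).  Assume each request finishes well within one timeslice,
 so the single CPU runs them strictly one after another, and model the
 idle-interpreter pool as a list we take from and return to:

    #!/usr/bin/perl
    use strict;

    # Toy model: a single CPU serializes the requests, and each request
    # finishes well within one timeslice.  Count how many distinct
    # interpreters get touched when the idle pool is LRU vs MRU.
    sub interpreters_touched {
        my ($nreq, $pool_size, $mru) = @_;
        my @pool = (1 .. $pool_size);      # idle interpreter ids
        my %touched;
        for (1 .. $nreq) {
            my $interp = $mru ? pop(@pool) : shift(@pool);
            $touched{$interp} = 1;
            push @pool, $interp;           # finished: back into the pool
        }
        return scalar keys %touched;
    }

    print "LRU: ", interpreters_touched(10, 10, 0), " interpreters\n";
    print "MRU: ", interpreters_touched(10, 10, 1), " interpreters\n";

 The LRU pool cycles through all 10 interpreters; the MRU pool re-uses
 one interpreter 10 times.  Real loads fall somewhere in between, but
 that's the direction the numbers move in.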

 This is why it's important that you don't assume that Unix is truly
 concurrent.

 > I also say that if you have a top limit of 10 interpreters on your
 > machine because of memory constraints, and you're sending in 10
 > simultaneous requests constantly, all interpreters will be used all the
 > time.  In that case it makes no difference to the throughput whether you
 > use MRU or LRU.

 This is not true for SpeedyCGI, because of the reason I give above.
 10 simultaneous requests will not necessarily require 10 interpreters.

 > >  What you say would be true if you had 10 processors and could get
 > >  true concurrency.  But on single-cpu systems you usually don't need
 > >  10 unix processes to handle 10 requests concurrently, since they get
 > >  serialized by the kernel anyways.
 > 
 > I think the CPU slices are smaller than that.  I don't know much about
 > process scheduling, so I could be wrong.  I would agree with you if we
 > were talking about requests that were coming in with more time between
 > them.  Speedycgi will definitely use fewer interpreters in that case.

 This URL:

    http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html

 says the default timeslice is 210ms (1/5th of a second) for Linux on a PC.
 There's also lots of good info there on Linux scheduling.
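
 To put that in perspective (my own rough numbers, not from that
 chapter): a hello-world perl request might need only a millisecond or
 two of CPU, so a single interpreter could run through on the order of
 a hundred such requests inside one 210ms slice, without the kernel
 ever forcing a switch.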

 > >  I found that setting MaxClients to 100 stopped the paging.  At concurrency
 > >  level 100, both mod_perl and mod_speedycgi showed similar rates with ab.
 > >  Even at higher levels (300), they were comparable.
 > 
 > That's what I would expect if both systems have a similar limit of how
 > many interpreters they can fit in RAM at once.  Shared memory would help
 > here, since it would allow more interpreters to run.
 > 
 > By the way, do you limit the number of SpeedyCGI processes as well?  It
 > seems like you'd have to, or they'd start swapping too when you throw
 > too many requests in.

 SpeedyCGI has an optional limit on the number of processes, but I didn't
 use it in my testing.

 > >  But, to show that the underlying problem is still there, I then changed
 > >  the hello_world script and doubled the amount of un-shared memory.
 > >  And of course the problem then came back for mod_perl, although speedycgi
 > >  continued to work fine.  I think this shows that mod_perl is still
 > >  using quite a bit more memory than speedycgi to provide the same service.
 > 
 > I'm guessing that what happened was you ran mod_perl into swap again. 
 > You need to adjust MaxClients when your process size changes
 > significantly.

 Right, but this also points out how difficult it is to get mod_perl
 tuning just right.  My opinion is that the MRU design adapts more
 dynamically to the load.

 > >  > >  I believe that with speedycgi you don't have to lower the MaxClients
 > >  > >  setting, because it's able to handle a larger number of clients, at
 > >  > >  least in this test.
 > >  >
 > >  > Maybe what you're seeing is an ability to handle a larger number of
 > >  > requests (as opposed to clients) because of the performance benefit I
 > >  > mentioned above.
 > > 
 > >  I don't follow.
 > 
 > When not all processes are in use, I think Speedy would handle requests
 > more quickly, which would allow it to handle n requests in less time
 > than mod_perl.  Saying it handles more clients implies that the requests
 > are simultaneous.  I don't think it can handle more simultaneous
 > requests.

 Don't agree.

 > >  > Are the speedycgi+Apache processes smaller than the mod_perl
 > >  > processes?  If not, the maximum number of concurrent requests you can
 > >  > handle on a given box is going to be the same.
 > > 
 > >  The size of the httpds running mod_speedycgi, plus the size of speedycgi
 > >  perl processes is significantly smaller than the total size of the httpds
 > >  running mod_perl.
 > > 
 > >  The reason for this is that only a handful of perl processes are required by
 > >  speedycgi to handle the same load, whereas mod_perl uses a perl interpreter
 > >  in all of the httpds.
 > 
 > I think this is true at lower levels, but not when the number of
 > simultaneous requests gets up to the maximum that the box can handle. 
 > At that point, it's a question of how many interpreters can fit in
 > memory.  I would expect the size of one Speedy + one httpd to be about
 > the same as one mod_perl/httpd when no memory is shared.  With sharing,
 > you'd be able to run more processes.

 I'd agree that the size of one Speedy backend + one httpd would be the
 same or even greater than the size of one mod_perl/httpd when no memory
 is shared.  But because the speedycgi httpds are small (no perl in them)
 and the number of SpeedyCGI perl interpreters is small, the total memory
 required is significantly smaller for the same load.
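
 To make the accounting concrete (made-up round numbers, purely for
 illustration): say a plain httpd is 1MB and each perl interpreter adds
 10MB of unshared memory.  Ten mod_perl httpds then cost about 110MB,
 while ten slim httpds plus the two or three speedy backends that MRU
 actually keeps busy cost about 30-40MB.  The per-pair sizes are
 comparable; the totals are not.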

 Sam
