Modperlers,

Since we've had a spirited little debate on this issue, I think it
might be nice to go into some detail.  Here are my ideas.

As Perrin has brought up, if you're doing a lot of queries, and that's
the primary focus of your Perl scripts, then parallelism is key.
However, since mod_proxy is still going to slurp the response data up
really quickly, you don't want a 1:1 ratio of mod_proxy to mod_perl
processes.

I think if the database is running on the local machine, a 1:2 ratio
of CPUs to mod_perl processes could very likely make a lot of sense.
Your processes may look a bit I/O bound, but they really aren't: while
they wait on a query, the database process is burning CPU on the same
machine, so you are still CPU bound, just in the other process.  If
your database is not local, you need more mod_perl processes.  Now, the
ratio of mod_proxy:mod_perl processes is not cut and dried.  I think
the better thing to do is consider what your mod_perl processes are
doing and size them in terms of mod_perl:CPU, then allow mod_proxy to
have as many processes as it wants.  They're really small, so memory
isn't so much the issue.

So what would be the benefit of having fewer mod_perl processes in a
mod_proxy environment?  Well, this is sort of a "well, duh" scenario.
Less memory consumption, so more memory is left over for running more
mod_proxy processes.  Also, fewer mod_perls means fewer
Oracle/MySQL/PostgreSQL processes, so less memory consumption and less
process switching, which means more CPU time for actual PROCESSING.
The mod_proxy CPU time consumption should be trivial, because those
processes spend most of their time blocked in synchronous I/O while
they stream bits to clients.  (Memory consumption is really important
in an Oracle database environment, as we all know.)
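
To put the memory argument in arithmetic form, here is a rough sketch
in Python.  The per-process sizes are made-up placeholders, not
measurements, and I am assuming each mod_perl process holds one
database connection and therefore one database backend process.  Plug
in your own numbers from top or ps.

```python
def estimated_memory_mb(n_mod_perl, n_mod_proxy,
                        mod_perl_mb=20.0,     # hypothetical size of one mod_perl process
                        db_backend_mb=10.0,   # hypothetical size of one database backend
                        mod_proxy_mb=1.0):    # hypothetical size of one mod_proxy process
    """Estimate total memory, assuming one database backend per mod_perl process."""
    heavy = n_mod_perl * (mod_perl_mb + db_backend_mb)
    light = n_mod_proxy * mod_proxy_mb
    return heavy + light


if __name__ == "__main__":
    # Same number of front-end proxies, different numbers of mod_perl processes.
    for n_mod_perl in (4, 8, 16):
        total = estimated_memory_mb(n_mod_perl, n_mod_proxy=100)
        print(n_mod_perl, "mod_perl processes:", total, "MB")
```

With sizes anything like these, each mod_perl process you drop pays
for a lot of extra mod_proxy processes.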

So, overall, I think you should decide how many mod_perl processes you
want completely separately from how many mod_proxy processes you want.
Base it instead on a ratio to the number of CPUs you have, considering
primarily what they're "bound" by.  If you have a remote database to
query, you'll want that ratio to be considerably higher than 1:1,
probably something like 4:1, but if someone has some "field" data on
that I would be interested in it.  If all your queries are local,
you'll want to keep that number really low.  1:1 might even make sense,
because no matter how you look at it, you really are CPU bound.
(Whether that CPU time is actually spent inside your mod_perl process
isn't really something to consider.)  But, as Perrin noted, if you have
some scripts that query TONS of data and others that query relatively
little, the 2:1 ratio might make sense.  No matter what, though, 1:1
will always produce the most overall efficiency in the "everything's
local" scenario.
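
The rules of thumb above boil down to a small lookup, sketched here in
Python.  The function and constant names are made up for illustration,
and the ratios are just the guesses from this post (the 4:1 remote
figure in particular has no field data behind it), so treat the result
as a starting point to tune from, not an answer.

```python
# mod_perl processes per CPU; guesses from this post, not measurements.
RATIO_ALL_LOCAL = 1      # database on the same machine
RATIO_LOCAL_MIXED = 2    # local database, but query sizes vary a lot
RATIO_REMOTE_DB = 4      # database on another machine


def suggested_mod_perl_processes(cpus, db_is_local, mixed_query_sizes=False):
    """Return a starting number of mod_perl processes for this machine."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if not db_is_local:
        ratio = RATIO_REMOTE_DB
    elif mixed_query_sizes:
        ratio = RATIO_LOCAL_MIXED
    else:
        ratio = RATIO_ALL_LOCAL
    return cpus * ratio


if __name__ == "__main__":
    for cpus in (1, 2, 4):
        print(cpus, "CPUs:",
              suggested_mod_perl_processes(cpus, db_is_local=True),
              suggested_mod_perl_processes(cpus, db_is_local=True,
                                           mixed_query_sizes=True),
              suggested_mod_perl_processes(cpus, db_is_local=False))
```

Note that the number of mod_proxy processes does not appear anywhere
in it, which is the point: size that side separately.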

Okay, these are my thoughts, what do you think?
Shane.
