On Tue, Mar 20, 2012 at 2:56 AM, Bob Proulx <b...@proulx.com> wrote:
> francis picabia wrote:
>> Bob Proulx wrote:
>> > francis picabia wrote:
>> >> One of the most frustrating problems which can happen in apache is to
>> >> see the error:
>> >>
>> >> server reached MaxClients setting
>> >
>> > Why is it frustrating?
>> Yes, maybe you don't know this condition.
> I am familiar with the condition.  But for me it is protection.  I am
> not maintaining anything interesting enough to be slashdotted.  But
> some of the sites do get attacked by botnets.
>> Suppose you have hundreds of users who might decide to dabble in
>> some php and not know much more than their textbook examples?  In
>> this case, part of the high connection rate is merely code running
>> on the server.  It comes from the server's IP,
> That would definitely be a lot of users to hit this limit all at one
> time only from internal use.  I could imagine a teacher assigning
> homework all due at midnight that would cause a concentration of
> effect though.
>> so no, a firewall rate limit won't help.  It is particularly annoying when
>> this happens after hours and we need to understand the situation
>> after the fact.
> You would need to have a way to examine the log files and gain
> knowledge from them.  Lots of connections in a short period of time
> would be the way to find that.
> The default configuration is designed to be useful to the majority of
> users.  That is why it is a default.  But especially if you are a
> large site and need industrial strength configuration then you will
> probably need to set some industrial strength configuration.
>> > Is that an error?  Or is that protection against a denial of service
>> > attack?  I think it is protection.
>> It does protect the OS, but it doesn't protect apache.  Apache stops
>> taking new connections, and it is just as good as if the system had
>> burned to the ground in terms of what the outside world sees.
> First, I don't quite follow you.  If Apache has been limited because
> there aren't enough system resources (such as ram) for it to run a
> zillion processes, then if it were to try to run a zillion processes
> it would hurt everything.  By hurt I mean either swapping and
> thrashing or by having the out of memory killer invoked or other bad
> things.  That isn't good for anyone.
> Secondly Apache continues to service connections.  It isn't stuck and
> it isn't "burned to the ground".  So that is incorrect.  It just won't
> spawn any new service processes.  The existing processes will continue
> to process queued requests from the network.  They will run at machine
> speed to handle the incoming requests as fast as they can.  Most web
> browsers will simply be slower to respond.  If the server can't keep
> up to the timeout from the browser then of course the browser will
> timeout.  That is the classic "slashdot" effect.  But that isn't to
> say that Apache stops responding.  A large number of clients will get
> serviced.
> There isn't any magic pixie dust to sprinkle on.  The machine can only
> do so much.  At some point eBay had to buy a second server.  :-)
>> MaxClients isn't much of a feature from the standpoint of running a service
>> as the primary purpose of the server.  When the maximum is reached,
>> apache does nothing to get rid of the problem.  It can just stew there
>> not resolving any hits for the next few hours.   It is not as useful say
>> as the OOM killer in the Linux kernel.
> I read your words but I completely disagree with them.  The apache
> server doesn't take a nap for a few hours.  Of course if you are
> slashdotted then the wave will continue for many hours or days but
> that is simply the continuing hits from the masses.  But if you have
> already max'd out your server then your server is max'd.  Unlike
> people the machine is good at math and knows that giving 110% is just
> psychological but not actually possible.  And as far as the OOM killer
> goes, well, it is never a good thing.
>> > The default for Debian's apache2 configuration is MaxClients 150.
>> > That is fine for many systems but way too high for many light weight
>> > virtual servers for example.  Every Apache process consumes memory.
>> > The amount of memory will depend upon your configuration (whether mod
>> > php or other modules are installed) but values between 20M and 50M are
>> > typical.  On the low end of 20M per process hitting 150 clients means
>> > use of 1000M (that is one gig) of memory.  If you only had a 512M ram
>> > server instance then this would be a serious VM thrash, would slow
>> > your server to a crawl, and would generally be very painful.  The
>> > default MaxClients 150 is probably suitable for any system with 1.5G
>> > or more of memory.  On a 4G machine the default should certainly be
>> > fine.  On a busier system you would need additional performance
>> > tuning.
>> Ours was at 256, already tuned.  So the problem is, as I stated, not
>> about raising the limit, but about troubleshooting the source of
>> the problem.
> I think you have some additional problem above and beyond hitting
> MaxClients.  If you are having some issue that is causing your system
> to be completely unresponsive for hours as you say then something else
> is going wrong.
>> > Look in your access and error logs for a high number of simultaneous
>> > clients.  Tools such as munin, awstats, webalizer and others may be
>> > helpful.  I use those in addition to scanning the logs directly.
>> I hate munin.  Too much overhead for the systems where we
>> are already close to performance issues when there is
>> peak traffic.  I already poll the load and it had not increased.
>> I should add a scan of the memory usage.
>> I prefer cacti for this sort of thing.
> As I simply listed munin as an example it isn't something I say is
> required.  But munin on a node is quite low overhead.  On the server
> where it is producing graphs it can be a burden.  But on individual
> nodes it is quite light weight.  Of course there are a lot of
> available plugins.  If a particular plugin is problematic (I don't
> know of any) I would simply disable that one plugin.
> Bob

I rattled on about MaxClients because the message isn't very helpful in
finding out why the server stopped responding.  It would be better
if Apache dumped a snapshot of the URLs it was servicing
at the moment MaxClients was reached.  And yes, in my experience,
the next thing Apache generally needs after hitting the limit is
a restart of the service to make it responsive again.  I've run
into this with LON-CAPA and merely 80 users, and with a Contao-driven
web site where one webmaster wrote inefficient PHP code to update
sports scores in a browser client.  It doesn't take a slashdot
event to bring down Apache.  More often the cause is code that
generates too many concurrent hits, or not enough memory for the
web app and the expected number of users.
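
For the "not enough memory" case, a rough back-of-the-envelope check can
be sketched as below.  This is only an illustration, not a tuning rule:
the function names and the reserved_mb allowance for the OS and other
services are my own assumptions, and the simple product ignores memory
pages shared between Apache processes, so it overstates real usage.

```python
def worst_case_mb(max_clients, per_process_mb):
    """Memory needed if every worker is busy and nothing is shared."""
    return max_clients * per_process_mb


def max_clients_for(ram_mb, per_process_mb, reserved_mb=256):
    """Largest worker count that fits in RAM after a reserve.

    reserved_mb is a hypothetical allowance for the OS and other
    daemons; measure your own system rather than trusting the default.
    """
    usable = ram_mb - reserved_mb
    if usable <= 0 or per_process_mb <= 0:
        return 0
    return usable // per_process_mb


if __name__ == "__main__":
    # Figures taken from this thread: 20M to 50M per process,
    # MaxClients of 150 (Debian default) and 256 (our setting).
    for clients in (150, 256):
        for size in (20, 50):
            print(clients, "workers x", size, "MB =",
                  worst_case_mb(clients, size), "MB")
    print("512 MB box, 20 MB workers:", max_clients_for(512, 20))
```

By this estimate 150 workers at 20M each come to 3000M in the worst
case, so the per-process size of the actual web app matters a lot more
than the MaxClients number on its own.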

The solution I've found for monitoring is to enable /server-status
with extended information (ExtendedStatus On for mod_status).  I wait
for the problem to appear and then look at which URLs are associated
with the W (sending reply) state.
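
Reading the status page by eye gets tedious, so here is a minimal
sketch of how the W-state rows could be tallied from a saved copy of the
page.  It assumes the extended status HTML has a table whose header row
includes columns named "M" and "Request"; the column layout differs
between Apache versions, so the column positions are looked up from the
header instead of being hard-coded.  The class and function names are
made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class _TableRows(HTMLParser):
    """Collect every table row as a list of cell text strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush_row()      # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()     # tolerate a missing </td>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag in ("tr", "table"):
            self._flush_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _flush_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _flush_row(self):
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None


def busy_requests(status_html, state="W"):
    """Return (request, count) pairs for workers in the given state."""
    parser = _TableRows()
    parser.feed(status_html)
    parser.close()
    header = None
    hits = Counter()
    for row in parser.rows:
        if "M" in row and "Request" in row:
            header = row           # the scoreboard header row
            continue
        if header is None or len(row) != len(header):
            continue               # not a scoreboard data row
        cells = dict(zip(header, row))
        if cells["M"] == state:
            hits[cells["Request"]] += 1
    return hits.most_common()
```

Fed the HTML of the status page (fetched with whatever tool you like
from a host that is allowed to see /server-status), busy_requests()
returns the requests being answered at that moment, most frequent
first.  Run from cron every minute or so, it leaves a record to look at
the morning after.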

Thanks for the responses from all.

