Make sure you set KeepAlive to Off in Apache.  That keeps a client from
queuing more than one request at a time unless it opens multiple
connections.  You can also have haproxy do this for you with option
httpclose, even if KeepAlive is enabled in Apache.

You could then use --hitcount (from the recent match) in iptables rules to
limit the number of connections per second per IP address...
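
To make the idea concrete, here is a rough Python sketch of per-IP hit
counting over a time window.  It only models the logic and is not how
iptables implements it (this version counts accepted connections only).
The HitCounter name, the limits and the addresses are all made up for
illustration.

```python
from collections import defaultdict, deque


class HitCounter:
    """Per-IP sliding-window limiter (illustrative model, not iptables itself)."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted connections

    def allow(self, ip, now):
        q = self.hits[ip]
        # Forget connections that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_hits:
            return False  # over the limit: drop or delay this connection
        q.append(now)
        return True


# Hypothetical traffic: a bot hammering from one IP, a user on another.
limiter = HitCounter(max_hits=2, window_seconds=1.0)
bot = [limiter.allow("10.0.0.1", t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
user = limiter.allow("10.0.0.2", 0.15)
print(bot, user)
```

The bot gets throttled once it exceeds its own budget, while the user on a
different address is counted separately and still gets through.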


> -----Original Message-----
> From: Wout Mertens [mailto:wout.mert...@gmail.com]
> Sent: Monday, November 16, 2009 9:19 AM
> To: John Lauro
> Cc: haproxy@formilux.org
> Subject: Re: Preventing bots from starving other users?
> 
> On Nov 16, 2009, at 2:43 PM, John Lauro wrote:
> 
> > Oops, my bad...  It's actually tc and not iptables.  Google "tc qdisc"
> > for some info.
> >
> > You could allow your local ips go unrestricted, and throttle all other
> > IPs to 512kb/sec for example.
> 
> Hmmm... The problem isn't the data rate, it's the work associated with
> incoming requests. As soon as a 500 byte request hits, the web server
> has to do a lot of work.
> 
> > What software is the wiki running on?  I assume it's not running under
> > apache or there would be some ways to tune apache.  As others have
> > mentioned, telling the crawlers to behave themselves or totally ignore
> > the wiki with a robots file is probably best.
> 
> Well the web server is Apache, but surprisingly Apache doesn't allow
> for tuning this particular case. Suppose normal request traffic looks
> like (A are users)
> 
> Time ->
> 
> A  A   AA  A    A   AAA  A    AA A
> 
> With the bot this becomes
> 
> ABBBBBBBBBB A BBBBA BBA BBBBBA AABBBBBB
> 
> So you can see that normal users are just swamped out of "slots". The
> webserver can render about 9 pages at the same time without impact, but
> it takes a second or more to render. At first I set MaxClients to 9,
> which makes it so the web server doesn't swap to death, but if the bots
> have 8 requests queued up, and then another 8, and another 8, regular
> users have no chance of decent interactivity...
> 
> This may be a corner case due to slow serving, because I'm having a
> hard time finding a way to throttle the bots. I suppose that normally
> you'd just add servers...
> 
> Wout.
> 