On Nov 16, 2009, at 2:43 PM, John Lauro wrote:
> Oops, my bad... It's actually tc and not iptables. Google "tc qdisc"
> for some info.
>
> You could allow your local IPs to go unrestricted, and throttle all
> other IPs to 512 kb/sec, for example.
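For reference, a minimal sketch of the tc approach John describes, assuming eth0 is the outbound interface and 192.168.0.0/16 is the local network (both hypothetical). Note that tc shapes egress traffic, so this caps response bandwidth to remote clients rather than limiting incoming requests:

```shell
# Hypothetical sketch (requires root): HTB root qdisc where any traffic
# not matched by a filter falls into the slow class 1:20.
tc qdisc add dev eth0 root handle 1: htb default 20

# Fast class for local users (the rate is an assumed uplink size).
tc class add dev eth0 parent 1: classid 1:10 htb rate 100mbit

# Slow class for everyone else, capped at 512 kbit/s.
tc class add dev eth0 parent 1: classid 1:20 htb rate 512kbit

# Direct traffic destined for local addresses into the fast class.
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dst 192.168.0.0/16 flowid 1:10
```

This is host configuration, not a tested recipe; interface name, network range, and rates would all need adjusting.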
Hmmm... The problem isn't the data rate, it's the work associated with incoming requests. As soon as a 500-byte request hits, the web server has to do a lot of work.

> What software is the wiki running on? I assume it's not running under
> apache or there would be some ways to tune apache. As others have
> mentioned, telling the crawlers to behave themselves or totally ignore
> the wiki with a robots file is probably best.

Well, the web server is Apache, but surprisingly Apache doesn't allow for tuning this particular case. Suppose normal request traffic looks like this (the A's are users):

Time ->
A A AA A A AAA A AA A

With the bot (B) this becomes:

ABBBBBBBBBB A BBBBA BBA BBBBBA AABBBBBB

So you can see that normal users are simply swamped out of "slots". The web server can render about 9 pages at the same time without impact, but each page takes a second or more to render. At first I set MaxClients to 9, which keeps the server from swapping itself to death, but if the bots have 8 requests queued up, and then another 8, and another 8, regular users have no chance of decent interactivity...

This may be a corner case due to slow page serving, because I'm having a hard time finding a way to throttle the bots. I suppose that normally you'd just add servers...

Wout.
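As an aside, the robots-file route mentioned above can be as simple as the sketch below. The nonstandard Crawl-delay directive is honored by some crawlers but ignored by others (and by misbehaving bots entirely), so it only helps against well-behaved crawlers:

```
# robots.txt sketch: ask well-behaved crawlers to pause between requests.
User-agent: *
Crawl-delay: 10

# Or, to keep crawlers out of the wiki entirely (the path is hypothetical):
# User-agent: *
# Disallow: /wiki/
```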