>>>>> "Doug" == Doug MacEachern <[EMAIL PROTECTED]> writes:
>> My CPU-based limiter is working quite nicely. It lets oodles of
>> static pages be served, but if someone starts doing CPU intensive
>> stuff, they get booted for hogging my server machine. The nice thing
>> is that I return a standard "503" error including a "retry-after", so
>> if it is a legitimate mirroring program, it'll know how to deal with
>> the error.
Doug> choice!
It's also been very successful at catching a whole slew of user-agents
that believe in sucking senselessly. Here's my current block-list:
    or m{Offline Explorer/}   # bad robot!
    or m{www\.gozilla\.com}   # bad robot!
    or m{pavuk-}              # bad robot!
    or m{ExtractorPro}        # bad robot!
    or m{WebCopier}           # bad robot!
    or m{MSIECrawler}         # bad robot!
    or m{WebZIP}              # bad robot!
    or m{Teleport Pro}        # bad robot!
    or m{NetAttache/}         # bad robot!
    or m{gazz/}               # bad robot!
    or m{geckobot}            # bad robot!
    or m{nttdirectory}        # bad robot!
    or m{Mister PiX}          # bad robot!
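For readers who want to try a list like this outside the server, here is a minimal sketch of one way to apply it to a User-Agent string. The patterns are the ones listed above; the helper name `is_bad_robot` and the array `@BAD_ROBOTS` are hypothetical, and how the check is wired into a request handler is not shown in this message.

```perl
use strict;
use warnings;

# Patterns copied from the block-list above, precompiled once.
my @BAD_ROBOTS = (
    qr{Offline Explorer/}, qr{www\.gozilla\.com}, qr{pavuk-},
    qr{ExtractorPro},      qr{WebCopier},         qr{MSIECrawler},
    qr{WebZIP},            qr{Teleport Pro},      qr{NetAttache/},
    qr{gazz/},             qr{geckobot},          qr{nttdirectory},
    qr{Mister PiX},
);

# Hypothetical helper: true if the User-Agent matches any pattern.
sub is_bad_robot {
    my ($agent) = @_;
    return 0 unless defined $agent;    # no User-Agent header at all
    for my $re (@BAD_ROBOTS) {
        return 1 if $agent =~ $re;
    }
    return 0;
}
```

Calling `is_bad_robot("Teleport Pro/1.29")` returns true, while an ordinary browser string falls through to the final `return 0`.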
Of course, these are just the ones that hit my site hard enough to trigger
the "exceeds 10% cumulative CPU in 15 seconds" rule. They often get
in trouble when they start invoking the 20 or 30 links in /books/
that look like /cgi/amazon?isbn=...., in SPITE of my /robots.txt that
says "don't look in /cgi". (More on that in a second...)
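The "10% cumulative CPU in 15 seconds" rule can be sketched as a sliding window per client. This is only an illustration under assumptions: the message does not say how the CPU samples are stored or keyed, so the in-memory `%history` hash, the `charge` and `check` helpers, and the choice of the window length as the Retry-After value are all hypothetical.

```perl
use strict;
use warnings;

my $WINDOW  = 15;      # seconds, from the rule above
my $MAX_CPU = 0.10;    # fraction of the window, from the rule above
my %history;           # assumed storage: client => [ [time, cpu_seconds], ... ]

# Record the CPU seconds that one finished request cost a client.
sub charge {
    my ($client, $now, $cpu) = @_;
    push @{ $history{$client} }, [ $now, $cpu ];
}

# Return (status, headers): 503 plus Retry-After if the client has used
# more than $MAX_CPU of the last $WINDOW seconds, otherwise 200.
sub check {
    my ($client, $now) = @_;
    my $samples = $history{$client} ||= [];

    # Forget samples that have fallen out of the window.
    @$samples = grep { $_->[0] > $now - $WINDOW } @$samples;

    my $used = 0;
    $used += $_->[1] for @$samples;

    return (200, {}) if $used <= $MAX_CPU * $WINDOW;
    return (503, { 'Retry-After' => $WINDOW });
}
```

With these numbers a client may burn up to 1.5 CPU seconds in any 15-second span; once it goes over, `check` keeps answering 503 until enough old samples have aged out.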
>> Doug - one thing I noticed is that mod_cgi isn't charging the
>> child-process time to the server anywhere between post-read-request
>> and log phases. Does that mean there's no "wait" or "waitpid" until
>> cleanup?
Doug> it should be, mod_cgi waits for the child, parsing its header output,
Doug> etc.
mod_cgi does no waiting. :) The only wait appears to be in the cleanup
handling area. Hence, in my logger, I do this:
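    ## first, reap any zombies so child CPU is proper:
    use POSIX qw(WNOHANG);    # non-blocking flag for waitpid
    {
      my $kid = waitpid(-1, WNOHANG);    # any child, but don't block
      if ($kid > 0) {
        # $r->log->warn("found kid $kid"); # DEBUG
        redo;                            # keep going until none are left
      }
    }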
And every mod_cgi request turned up a child process for me to reap in
this LogHandler, so I know that mod_cgi is not reaping them. FYI. :)
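The reason the reaping matters for a CPU limiter is that a process is only charged for a child's CPU time once it has waited on that child. The following standalone sketch, which runs outside Apache, shows the same non-blocking reap as a loop together with a way to read the child CPU total; the helper names `reap_zombies` and `child_cpu` are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Reap every child that has already exited, without blocking.
# Returns the number of children reaped.
sub reap_zombies {
    my $reaped = 0;
    while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
    return $reaped;
}

# User + system CPU seconds of children this process has waited for.
# Children that have exited but not been reaped are not included yet.
sub child_cpu {
    my (undef, undef, $cuser, $csys) = times;
    return $cuser + $csys;
}
```

Reading `child_cpu()` before and after `reap_zombies()` gives the CPU cost of whatever children were collected in between, which is the figure a per-request limiter would want to charge.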
>> Also, Doug, can there be only one $r->cleanup_handler? I was getting
>> intermittent results until I changed my ->cleanup_handler into a
>> push'ed loghandler. I also use ->cleanup_handler in other modules, so
>> I'm wondering if there's a conflict.
Doug> you should be able to use any number of cleanup_handlers. do you have a
Doug> small test case to reproduce the problem?
Grr. Not really. I just moved everything to LogHandlers instead. I
just no longer trust cleanup_handlers, because my tests were
consistent with "only one cleanup permitted".
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!