Re: Random requests in log file

2001-08-07 Thread Nick Tonkin

On Tue, 7 Aug 2001, Christof Damian wrote:

> Bill Moseley wrote:
>
> > Does everyone else see these?  What's the deal?  Are they really probes or
> > some spider run amok?
> >
> > Right now someone is looking for things like:
> >
> >  /r/dr
> >  /r/g3
> >  /r/sb
> >  /r/sw
> >  /r/s/2
> >  /r/a/booth
> >  /r/s/pp
> >  /NowPlaying
> >  /mymovies/list
> >  /terms
> >  /ootw/1999/oarch99_index.html
>
> the first couple look like Yahoo:
>
> www.yahoo.com/r/dr
> www.yahoo.com/r/sw


Yes, and I have seen plenty of cases where broken web servers, web sites,
or web browsers screw up HREFs by prepending an incorrect root URI to a
relative link.

That would be my guess: broken URLs somewhere out in space.
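
For example, the same relative HREF lands on two different hosts depending
on which base it gets resolved against.  A quick sketch using the standard
URI module (our.example.com is just a stand-in for the affected server):

    use URI;

    # A relative HREF as it might appear on a yahoo.com page:
    my $link = 'r/dr';

    # Resolved against the correct base, it points where it should:
    print URI->new_abs($link, 'http://www.yahoo.com/'), "\n";
    # -> http://www.yahoo.com/r/dr

    # Resolved against the wrong base, it turns into one of the
    # bogus requests showing up in the logs:
    print URI->new_abs($link, 'http://our.example.com/'), "\n";
    # -> http://our.example.com/r/dr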


- nick




Re: Random requests in log file

2001-08-07 Thread Bill Moseley

At 10:24 AM 08/07/01 -0700, Nick Tonkin wrote:
> > >  /r/dr
> > >  /r/g3
> > >  /r/sb
> > www.yahoo.com/r/dr
> > www.yahoo.com/r/sw
>
> Yes, and I have seen plenty of cases where broken web servers, web sites,
> or web browsers screw up HREFs by prepending an incorrect root URI to a
> relative link.
>
> That would be my guess: broken URLs somewhere out in space.

But why the continued hits for the wrong pages?  It's like someone spidered
an entire site and has now gone back to test all those HREFs against our
server.

Currently mod_perl is generating a 404 page.  When I block, I return
FORBIDDEN, but that doesn't seem to stop the requests either.  They don't
seem to get the message...
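
For reference, the blocking is an access handler along these lines (a
minimal sketch assuming mod_perl 1.x and Apache::Constants; the package
name and URI patterns are illustrative):

    package My::BlockJunk;

    use strict;
    use Apache::Constants qw(FORBIDDEN DECLINED);

    sub handler {
        my $r = shift;

        # Refuse the Yahoo-style paths showing up in the logs.
        return FORBIDDEN if $r->uri =~ m{^/(?:r/|NowPlaying|mymovies/)};

        # Let everything else fall through to the normal handlers.
        return DECLINED;
    }

    1;

Installed with a "PerlAccessHandler My::BlockJunk" directive in httpd.conf.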

And isn't it correct that if they request again before CLOSE_WAIT is up,
I'll need to spawn more servers?

If they are not sending requests in parallel, I wonder if it would be easier
on my resources to really slow down responses, as long as I don't tie up too
many of my processes.  If they ignore FORBIDDEN, maybe they will see the
timeouts.
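
One way to try that is to sleep in the handler before refusing (again just
a sketch under the same mod_perl 1.x assumptions; the 20-second delay is
arbitrary):

    package My::Tarpit;

    use strict;
    use Apache::Constants qw(FORBIDDEN DECLINED);

    sub handler {
        my $r = shift;

        # Only tarpit the bogus Yahoo-style requests.
        return DECLINED unless $r->uri =~ m{^/r/};

        # Make the client wait before it is refused.  This ties up one
        # httpd child for the whole delay, so it only pays off against
        # a client that sends its requests serially.
        sleep 20;

        return FORBIDDEN;
    }

    1;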

Time to look at the Throttle modules, I suppose.



Bill Moseley
mailto:[EMAIL PROTECTED]