Re: Dealing with spiders

2000-11-21 Thread Jimi Thompson
I vote for that! Make my life about 5000 times simpler :) Marko van der Puil wrote: > Hi, > > I had the same thing, sometimes the spiders are programmed VERY sloppily. I had a > site that responded to ANY request made to its location. The majority of spiders > do not understand single and

Re: Dealing with spiders

2000-11-11 Thread ___cliff rayman___
Robin Berjon wrote: > But on a related issue, I got several logfiles corrupted because I log > user-agents there and some seem to use some unicode names that confuse > Apache and convert to \n. Does anyone else have this problem? I don't > think it could lead to server compromise, but it's ne

Re: Dealing with spiders

2000-11-11 Thread Robin Berjon
At 10:46 06/11/2000 -0800, ___cliff rayman___ wrote: >> 64.3.57.99 - "-" [04/Nov/2000:04:36:22 -0800] "GET /../../../ HTTP/1.0" 400 >> 265 "-" "Microsoft Internet Explorer/4.40.426 (Windows 95)" 5740 > >i don't think u have a lame spider here. i think u have a hacker trying to >hack >your server

Re: Dealing with spiders

2000-11-10 Thread Christoph Wernli
Bill Moseley wrote: > > At 03:29 PM 11/10/00 +0100, Marko van der Puil wrote: > >What we could do as a community is create spiderlawenforcement.org, > >a centralized database where we keep track of spiders and how they > >index our sites. > > At this point, I'd just like to figure out how to det

Re: Dealing with spiders

2000-11-10 Thread Bill Moseley
At 03:29 PM 11/10/00 +0100, Marko van der Puil wrote: >What we could do as a community is create spiderlawenforcement.org, >a centralized database where we keep track of spiders and how they >index our sites. It's an issue weekly, but hasn't become that much of a problem yet. The bad spiders cou

Re: Dealing with spiders

2000-11-10 Thread Marko van der Puil
Hi, I had the same thing, sometimes the spiders are programmed VERY sloppily. I had a site that responded to ANY request made to its location. The majority of spiders do not understand single and double quotes, or the case where you leave quotes out of your HREFs altogether. Also, I understand that absolute

Re: Dealing with spiders

2000-11-06 Thread ___cliff rayman___
Bill Moseley wrote: > But it's amazing how many are just lame in that they take perfectly good > HREF tags and mess them up in the request. For example, every day I see > many requests from Novell's BorderManager where they forgot to convert HTML > entities in HREFs before making the request. >

Dealing with spiders

2000-11-04 Thread Bill Moseley
This is slightly OT, but any solution I use will be mod_perl, of course. I'm wondering how people deal with spiders. I don't mind being spidered as long as it's a well-behaved spider and follows robots.txt. And at this point I'm not concerned with the load spiders put on the server (and I know