I vote for that! Make my life about 5000 times simpler :)
Marko van der Puil wrote:
> Hi,
>
> I had the same thing; sometimes the spiders are programmed VERY sloppily. I had a
> site that responded to ANY request made to its location. The majority of spiders
> do not understand about single and
Robin Berjon wrote:
> But on a related issue, I got several logfiles corrupted because I log
> user-agents there and some seem to use some unicode names that confuse
> Apache and convert to \n. Does anyone else have this problem ? I don't
> think it could lead to a server compromise, but it's ne
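One way to keep a raw User-Agent from corrupting a custom logfile is to escape anything non-printable before it ever reaches the log. This is only an illustrative sketch (the helper name and escaping scheme are mine, not from the thread):

```perl
#!/usr/bin/perl
# Sketch: escape non-printable bytes in a User-Agent string so a stray
# newline or unicode byte cannot inject a bogus line into the logfile.
use strict;
use warnings;

sub sanitize_ua {
    my ($ua) = @_;
    $ua = '' unless defined $ua;
    # replace each byte outside printable ASCII with a \xNN escape
    $ua =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord($1))/ge;
    return $ua;
}

print sanitize_ua("Mozilla/4.0\nfake entry"), "\n";  # prints Mozilla/4.0\x0afake entry
```

In a mod_perl setup you would run the header value through something like this before logging it yourself; Apache's own escaping (the `\n` conversion mentioned above) happens independently.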
At 10:46 06/11/2000 -0800, ___cliff rayman___ wrote:
>> 64.3.57.99 - "-" [04/Nov/2000:04:36:22 -0800] "GET /../../../ HTTP/1.0" 400
>> 265 "-" "Microsoft Internet Explorer/4.40.426 (Windows 95)" 5740
>
>i don't think you have a lame spider here. i think you have a hacker trying
>to hack your server
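Requests like the `GET /../../../` above are classic path-traversal probes. A minimal sketch of spotting them (the function is illustrative; in a real mod_perl handler you would test `$r->uri` the same way):

```perl
#!/usr/bin/perl
# Sketch: flag request URIs containing a "../" path segment, the pattern
# seen in the quoted log line. Apache already rejects these with a 400,
# but you may still want to notice them for your own accounting.
use strict;
use warnings;

sub looks_like_traversal {
    my ($uri) = @_;
    # a ".." segment bounded by slashes (or string start/end)
    return $uri =~ m{(?:^|/)\.\.(?:/|$)} ? 1 : 0;
}

print looks_like_traversal('/../../../') ? "suspicious\n" : "ok\n";
```

Note it only matches whole `..` segments, so legitimate paths that merely contain two dots (say `/a..b/`) pass through.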
Bill Moseley wrote:
>
> At 03:29 PM 11/10/00 +0100, Marko van der Puil wrote:
> >What we could do as a community is create spiderlawenforcement.org,
> >a centralized database where we keep track of spiders and how they
> >index our sites.
>
> At this point, I'd just like to figure out how to det
At 03:29 PM 11/10/00 +0100, Marko van der Puil wrote:
>What we could do as a community is create spiderlawenforcement.org,
>a centralized database where we keep track of spiders and how they
>index our sites.
It's an issue weekly, but hasn't become that much of a problem yet. The
bad spiders cou
Hi,
I had the same thing; sometimes the spiders are programmed VERY sloppily. I had a
site that responded to ANY request made to its location. The majority of spiders
do not understand single and double quotes, or if you leave quotes out of
your HREFs at all. Also, I understand that absolute
Bill Moseley wrote:
> But it's amazing how many are just lame in that they take perfectly good
> HREF tags and mess them up in the request. For example, every day I see
> many requests from Novell's BorderManager where they forgot to convert HTML
> entities in HREFs before making the request.
>
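The failure Bill describes is a spider that takes `href="/page?a=1&amp;b=2"` and requests the URL without decoding `&amp;`. A hedged sketch of undoing the most common leftover entities (the helper and its entity table are mine, not anything the list agreed on; CPAN's HTML::Entities does this properly):

```perl
#!/usr/bin/perl
# Sketch: repair a request URI in which a broken spider left HTML
# entities from the HREF undecoded (e.g. "&amp;" instead of "&").
use strict;
use warnings;

my %entity = ('&amp;' => '&', '&lt;' => '<', '&gt;' => '>', '&quot;' => '"');

sub fix_unconverted_entities {
    my ($uri) = @_;
    $uri =~ s/(&(?:amp|lt|gt|quot);)/$entity{$1}/g;
    return $uri;
}

print fix_unconverted_entities('/page?a=1&amp;b=2'), "\n";  # prints /page?a=1&b=2
```

Whether to silently repair such requests or just 404 them is a policy call; repairing keeps the lame spiders out of your error logs.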
This is slightly OT, but any solution I use will be mod_perl, of course.
I'm wondering how people deal with spiders. I don't mind being spidered as
long as it's a well-behaved spider and follows robots.txt. And at this
point I'm not concerned with the load spiders put on the server (and I know
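One rough way (purely illustrative, every name below is an assumption) to separate well-behaved spiders from the rest is to remember which client IPs ever fetched /robots.txt, and treat crawl-like clients that never did as suspect:

```perl
#!/usr/bin/perl
# Sketch: track which client IPs have requested /robots.txt; a spider
# that crawls without ever fetching it is by definition not honoring it.
use strict;
use warnings;

my %fetched_robots;

sub note_request {
    my ($ip, $uri) = @_;
    $fetched_robots{$ip} = 1 if $uri eq '/robots.txt';
}

sub is_polite_spider {
    my ($ip) = @_;
    return $fetched_robots{$ip} ? 1 : 0;
}

note_request('10.0.0.1', '/robots.txt');
note_request('10.0.0.1', '/index.html');
note_request('10.0.0.2', '/index.html');
print is_polite_spider('10.0.0.1'), " ", is_polite_spider('10.0.0.2'), "\n";  # prints 1 0
```

In a mod_perl handler the hash would live in a shared store (or a DBM file) rather than process memory, since each child sees only its own requests.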