Re: LinkWalker
I have this same robot on my site. Can I block this robot using .htaccess files?

Chris
http://www.truefootball.com
http://www.worldofjerseys.com
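It can, assuming the server is Apache with mod_setenvif available and `.htaccess` overrides enabled (`AllowOverride` must permit access-control directives). A minimal sketch using the Apache 1.3/2.0-era `Order`/`Deny` syntax; "LinkWalker" is the agent string reported in this thread, and note that a robot can trivially spoof its User-Agent, which is why others here block by IP instead:

```apache
# .htaccess - deny any request whose User-Agent header contains
# "LinkWalker" (case-insensitive match via mod_setenvif).
SetEnvIfNoCase User-Agent "LinkWalker" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

This is a config fragment, not a guarantee: it only stops robots that send their real agent string.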
Re: LinkWalker
On Tuesday 08 January 2002 01:38, Russell Coker wrote:
> On Mon, 7 Jan 2002 23:31, Nathan Strom wrote:
> > > I have a nasty web spider with an agent name of LinkWalker
> > > downloading everything on my site (including .tgz files). Does
> > > anyone know anything about it?
> >
> > It's apparently a link-validation robot operated by a company called
> > SevenTwentyFour Incorporated, see:
> > http://www.seventwentyfour.com/tech.html
>
> Oops. Actually they sent me an offer of a free trial to their service
> (which seems quite useful). The free trial gave me some useful stats
> and let me fix a bunch of broken links (of course I didn't pay).

You can do the same thing with wget:

    --spider
        When invoked with this option, Wget will behave as a Web spider,
        which means that it will not download the pages, just check that
        they are there. You can use it to check your bookmarks, e.g. with:

            wget --spider --force-html -i bookmarks.html

        This feature needs much more work for Wget to get close to the
        functionality of real WWW spiders.

You'll be checking more than bookmarks but you get the idea.

Jesse

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of
"unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: LinkWalker
On 8 Jan 2002, at 9:56, Jesse Goerz wrote:
> You can do the same thing with wget:
>
>     wget --spider --force-html -i bookmarks.html
>
> You'll be checking more than bookmarks but you get the idea.

In case you are running ht://dig, there's an add-on on the contributed
works page to parse htdig's output and generate a broken-links report
from it. Since htdig touches every link anyway, this is quite convenient.

Cheers,
Marcel

--
Enjoy Debian/GNU Linux - Now even on the 5 Euro banknote!
Re: LinkWalker
On Mon, 7 Jan 2002 23:31, Nathan Strom wrote:
> > I have a nasty web spider with an agent name of LinkWalker
> > downloading everything on my site (including .tgz files). Does anyone
> > know anything about it?
>
> It's apparently a link-validation robot operated by a company called
> SevenTwentyFour Incorporated, see:
> http://www.seventwentyfour.com/tech.html

Oops. Actually they sent me an offer of a free trial to their service
(which seems quite useful). The free trial gave me some useful stats and
let me fix a bunch of broken links (of course I didn't pay).

Hmm, I wonder if they REALLY downloaded those files or aborted the
transfers after the first few K (needed to verify that the link was
correct). Anyway I'll remove that line from my iptables configuration
now!

> Personally, I think this is a rogue organization -- there was an entry
> from this spider in our logs coming from a Seven24 IP with an HTTP
> referrer of
> www.adultinterracialsexvideos.com/interracialsex/interracialgroupsexsen.html.
> Needless to say, we do not run an adult web site and that referrer site
> does NOT have a link to us. Likely Seven24 is trying to clutter
> people's logs with references as a form of advertising.

A single entry in web logs does not mean much. If I blocked every origin
of a bad entry in my web logs I'd be busy all day doing it...

--
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page
Re: LinkWalker
[EMAIL PROTECTED] (Russell Coker) wrote in message news:[EMAIL PROTECTED]...
> I have a nasty web spider with an agent name of LinkWalker downloading
> everything on my site (including .tgz files). Does anyone know anything
> about it?

It's apparently a link-validation robot operated by a company called
SevenTwentyFour Incorporated, see:
http://www.seventwentyfour.com/tech.html

I found your post while searching for information on this robot while
tracking its spoor through our HTTP logs here.

> I've added the following to my firewall setup to stop further
> attacks...
>
>     # crappy LinkWalker - evil spider that downloads every file,
>     # including .tgz, on the site
>     iptables -A INPUT -j logitrej -p tcp -s 209.167.50.25 \
>         -d 0.0.0.0/0 --dport www

We were hit from 209.167.50.22; if you want to use iptables to block
this spider, I'd block all of NETBLK-NET-SEVEN24AUU1, 209.167.50.16 -
209.167.50.31.

Personally, I think this is a rogue organization -- there was an entry
from this spider in our logs coming from a Seven24 IP with an HTTP
referrer of
www.adultinterracialsexvideos.com/interracialsex/interracialgroupsexsen.html.
Needless to say, we do not run an adult web site and that referrer site
does NOT have a link to us. Likely Seven24 is trying to clutter people's
logs with references as a form of advertising.
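The suggested range 209.167.50.16 - 209.167.50.31 is 16 addresses on an aligned boundary, i.e. exactly the single prefix 209.167.50.16/28, so one netmask rule covers the whole block. A quick sanity check with Python's standard `ipaddress` module (a sketch; the addresses are the ones reported in this thread):

```python
import ipaddress

# The Seven24 netblock reported in this thread:
# 209.167.50.16 - 209.167.50.31 collapses to a single /28 prefix.
seven24 = ipaddress.ip_network("209.167.50.16/28")

print(seven24.num_addresses)    # 16
print(seven24[0], seven24[-1])  # 209.167.50.16 209.167.50.31

# Both spider source IPs seen in the thread fall inside the block:
for ip in ("209.167.50.25", "209.167.50.22"):
    print(ip, ipaddress.ip_address(ip) in seven24)
```

With that confirmed, a single iptables rule with `-s 209.167.50.16/28` replaces the per-IP rules.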
Re: LinkWalker
[EMAIL PROTECTED] (Russell Coker) wrote in message news:[EMAIL PROTECTED]...
> I wasn't aware that there was any format to robots.txt, I thought that
> the mere presence of such a file would prevent robots from visiting.

Nope; see: http://www.robotstxt.org/wc/robots.html
Re: LinkWalker
Bwahahaha!! Man, that is low. Advertising to sysadmins through the access
logs. Sheesh. But now that you mention 7-24, I think I recognize that. I
think they are a spam marketing outfit.

At 02:31 PM 1/7/02 -0800, Nathan Strom wrote:
> Personally, I think this is a rogue organization -- there was an entry
> from this spider in our logs coming from a Seven24 IP with an HTTP
> referrer of
> www.adultinterracialsexvideos.com/interracialsex/interracialgroupsexsen.html.
> Needless to say, we do not run an adult web site and that referrer site
> does NOT have a link to us. Likely Seven24 is trying to clutter
> people's logs with references as a form of advertising.

--
REMEMBER THE WORLD TRADE CENTER ---= WTC 911 =-- 0100
Re: LinkWalker
> site does NOT have a link to us. Likely Seven24 is trying to clutter
> people's logs with references as a form of advertising.

... a practice we see more and more often here as well! Even
'respectable' major ISPs are starting to do it! It's a strange world ...

Frank Louwers
Openminds b.v.b.a.
Re: LinkWalker
On Mon, 24 Dec 2001 06:42, Jeremy Lunn wrote:
> On Sun, Dec 23, 2001 at 05:41:47PM +0100, Russell Coker wrote:
> > I have a nasty web spider with an agent name of LinkWalker
> > downloading everything on my site (including .tgz files). Does
> > anyone know anything about it?
>
> Surely you'd be able to disallow access to it with Apache?

Yes, but using iptables is easier. My Apache setup is complex enough
already...

--
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page
Re: LinkWalker
On Mon, Dec 24, 2001 at 11:43:09AM +0100, Russell Coker wrote:
> > > I have a nasty web spider with an agent name of LinkWalker
> > > downloading everything on my site (including .tgz files). Does
> > > anyone know anything about it?
> >
> > Surely you'd be able to disallow access to it with Apache?
>
> Yes, but using iptables is easier. My Apache setup is complex enough
> already...

But that's assuming that it comes from the same IP addr.

--
Jeremy Lunn
Melbourne, Australia
http://www.jabber.org/ - the next generation of Instant Messaging.
Re: LinkWalker
<quote who="Russell Coker">

> > Why don't you just update your robots.txt to explicitly specify
> > which files you don't, or do, allow spiders access to? If it's a
> > rule-abiding spider, that will be the end of it.
>
> I wasn't aware that there was any format to robots.txt, I thought that
> the mere presence of such a file would prevent robots from visiting.

http://www.searchtools.com/robots/robots-txt.html

- Jeff

--
Funny, I have no trouble distinguishing my mobile phone from the others
because it's in my _own fucking pocket_! - Mobile Rage
LinkWalker
I have a nasty web spider with an agent name of LinkWalker downloading
everything on my site (including .tgz files). Does anyone know anything
about it?

I've added the following to my firewall setup to stop further attacks...

    # crappy LinkWalker - evil spider that downloads every file,
    # including .tgz, on the site
    iptables -A INPUT -j logitrej -p tcp -s 209.167.50.25 \
        -d 0.0.0.0/0 --dport www

--
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page
Re: LinkWalker
Why don't you just update your robots.txt to explicitly specify which
files you don't, or do, allow spiders access to? If it's a rule-abiding
spider, that will be the end of it.

On Sun, Dec 23, 2001 at 05:41:47PM +0100, Russell Coker wrote:
> I have a nasty web spider with an agent name of LinkWalker downloading
> everything on my site (including .tgz files). Does anyone know anything
> about it?
>
> I've added the following to my firewall setup to stop further
> attacks...
>
>     # crappy LinkWalker - evil spider that downloads every file,
>     # including .tgz, on the site
>     iptables -A INPUT -j logitrej -p tcp -s 209.167.50.25 \
>         -d 0.0.0.0/0 --dport www

--
Nick Jennings
Re: LinkWalker
On Sun, 23 Dec 2001 20:28, Nick Jennings wrote:
> Why don't you just update your robots.txt to explicitly specify which
> files you don't, or do, allow spiders access to? If it's a rule-abiding
> spider, that will be the end of it.

I wasn't aware that there was any format to robots.txt, I thought that
the mere presence of such a file would prevent robots from visiting.

As for rule-abiding spiders, such programs will not download files
ending in .wav, .mp3, .gz, .tgz, or .zip so I won't even see them.
That's why I usually don't even notice responsible web spiders such as
google when browsing my web logs!

> On Sun, Dec 23, 2001 at 05:41:47PM +0100, Russell Coker wrote:
> > I have a nasty web spider with an agent name of LinkWalker
> > downloading everything on my site (including .tgz files). Does
> > anyone know anything about it?
> >
> > I've added the following to my firewall setup to stop further
> > attacks...

--
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page
Re: LinkWalker
On Sun, Dec 23, 2001 at 09:17:54PM +0100, Russell Coker wrote:
> I wasn't aware that there was any format to robots.txt, I thought that
> the mere presence of such a file would prevent robots from visiting.

Here is an example of my robots.txt:

    User-agent: *
    Disallow: /webalizer/
    Disallow: contacts.txt
    Disallow: /dl/

> As for rule-abiding spiders, such programs will not download files
> ending in .wav, .mp3, .gz, .tgz, or .zip so I won't even see them.
> That's why I usually don't even notice responsible web spiders such as
> google when browsing my web logs!

Hmm, I have had spiders grab .tgz's from me before, but not anymore.

User-agent can be set to a specific spider agent-name, or * for all
spiders.

--
Nick Jennings
Re: LinkWalker
You should be able to tell if it cares about robots.txt by looking in
the logs to see if it's downloading /robots.txt. If it is, then
something like:

    User-agent: LinkWalker
    Disallow: /

will keep it off your site. If it doesn't, then iptables will keep it
away.

Robots info:
http://www.global-positioning.com/robots_text_file/index.html

The fact that it downloads binaries too makes me think it's a site
sucker and not a legit spider.

At 12:30 PM 12/23/01 -0800, Nick Jennings wrote:
> On Sun, Dec 23, 2001 at 09:17:54PM +0100, Russell Coker wrote:
> > I wasn't aware that there was any format to robots.txt, I thought
> > that the mere presence of such a file would prevent robots from
> > visiting.

--
---=REMEMBER THE WORLD TRADE CENTER=--- ___/` WTC 911 `\___ 0100
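Python's standard `urllib.robotparser` implements the robots.txt matching rules, so you can check offline what a compliant robot would do with a record like the one above (a sketch; the agent names are the ones from this thread):

```python
from urllib import robotparser

# The record suggested above: shut LinkWalker out of the whole site.
rules = """\
User-agent: LinkWalker
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A compliant LinkWalker must stay away from every path...
print(rp.can_fetch("LinkWalker", "/files/release.tgz"))  # False

# ...while agents without a matching record are unaffected.
print(rp.can_fetch("Googlebot", "/files/release.tgz"))   # True
```

Of course this only models a robot that honors robots.txt at all; for one that ignores it, the firewall approach elsewhere in the thread is the fallback.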
Re: LinkWalker
On Sun, Dec 23, 2001 at 05:41:47PM +0100, Russell Coker wrote:
> I have a nasty web spider with an agent name of LinkWalker downloading
> everything on my site (including .tgz files). Does anyone know anything
> about it?

Surely you'd be able to disallow access to it with Apache?

--
Jeremy Lunn
Melbourne, Australia
http://www.jabber.org/ - the next generation of Instant Messaging.