Jane,

You can read about the "robots.txt" file at

http://lucene.apache.org/nutch/bot.html#Sysadmins%2Frobots.txt

Basically, a "robots.txt" file is a plain text file that you create and copy
to the root of your website. Well-behaved crawlers and robots read the file
to determine whether they are allowed to crawl your website. If you do not
have a "robots.txt" file on your web server, my understanding is that
crawlers and robots assume it is okay to crawl your website. It sounds like
this is the opposite of what you thought.

The page listed above has the commands you need to add to your "robots.txt"
file to stop the Nutch robot from crawling your website. If you are using MS
Windows, you can use Notepad to create and edit the file.
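
If you or someone helping you wants to check the file before uploading it,
here is a rough sketch in Python. It is only an illustration: the two rules
in ROBOTS_TXT are my assumption of what a "block Nutch only" file looks like
(the page above is the authority on the exact commands), and
www.example.com and "OtherBot" are made-up placeholders. It uses Python's
standard urllib.robotparser module to ask whether a given robot may fetch a
given page.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules (not copied from the Nutch page): block only the crawler
# that identifies itself as "Nutch", and leave every other robot alone.
ROBOTS_TXT = """\
User-agent: Nutch
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical page on a placeholder site.
page = "http://www.example.com/contact.html"
print(is_allowed(ROBOTS_TXT, "Nutch", page))     # False
print(is_allowed(ROBOTS_TXT, "OtherBot", page))  # True
```

Keep in mind that this only shows what the file asks robots to do. It works
only for crawlers that choose to honor "robots.txt".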

I hope this helps.

Richard

-----Original Message-----
From: Jane de Silva [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 22, 2005 5:23 AM
To: nutch-agent@lucene.apache.org
Subject: Spider Causing Contact Form Submissions

Hi

For the past couple of weeks I have been receiving blank contact form
submissions caused by sitesell.com's use of your software. When this first
started happening I contacted sitesell and was assured that the problem
would be fixed. For a few days it appeared that it had been attended to, but
the emails began to appear again this weekend and are still continuing to
arrive. All sitesell will now say is that I should edit the robots.txt file,
and they have essentially washed their hands of the problem. I run a very
small part-time concern, do not have the foggiest idea about this file or how
to edit it, and in any case I consider this a point of principle...

If sitesell or any other company finds value in using your software - and it
is certainly of no value to me! - they should have the courtesy to properly
address issues such as this when they arise. If I was forced to leave open
the door to my home because I could not locate my key or was unable, for one
reason or another, to operate the locking mechanism, it does not make it
right for a person to trespass on my property and meddle with its contents.
Although I have no problem in principle with the spider accessing my site
(given that their reasons for doing so are not against my best interests),
it should not cause me inconvenience by so doing. 

Sitesell's apparent lack of interest in my problem displays a cavalier
attitude and is discourteous and unneighbourly. I would be grateful if you
would help me put an end to their unwelcome intrusion into my life.

Kind regards

Jane de Silva
www.indigographics.co.uk 
[EMAIL PROTECTED]

 

