[analog-help] Couple of ?'s regarding Spiders, Crawlers, Search Engines and Logging

2001-01-26 Thread Ian Stong

Perhaps a bit of a side topic, but I'm trying to see what people are doing about spiders, crawlers and search engines - logging, blocking, tracking them, etc.

Specifically, I'm wondering whether it's possible to limit spiders/crawlers to only certain parts of your web site (perhaps with a robots.txt file)?
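
(For reference, the sort of thing I have in mind is a robots.txt file at the document root along these lines - the paths are just for illustration, and of course only well-behaved robots honour it:)

    # /robots.txt - ask robots to stay out of these areas
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /staging/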

I also found a third-party file distributed with Analog called srch.cfg that lists search engines. I've referenced it in my analog.cfg file but I'm not sure how to make use of it. Is it used to generate a separate report tracking search engine accesses to your site, or something else?

I'm looking for someone who has implemented ways to track spider, crawler and search engine access to a site, and ways to minimize/block that traffic. The sketch below shows the closest I've come so far.
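
The simplest thing I can think of is just scanning the raw access log for known robot user-agents - roughly along the lines of this sketch (it assumes a combined-format log named access.log, and the agent substrings are only a hand-picked sample):

    #!/usr/bin/env python
    # Rough sketch: count hits per robot by matching the user-agent field
    # (the last quoted field in an Apache combined-format log line).
    import re

    ROBOT_SIGNS = ["Googlebot", "Slurp", "Scooter", "ia_archiver", "Lycos"]

    hits = {}
    for line in open("access.log"):
        quoted = re.findall(r'"([^"]*)"', line)
        if not quoted:
            continue
        agent = quoted[-1]          # user-agent string
        for sign in ROBOT_SIGNS:
            if sign.lower() in agent.lower():
                hits[sign] = hits.get(sign, 0) + 1
                break

    for sign in sorted(hits, key=hits.get, reverse=True):
        print(sign, hits[sign])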


Thanks in advance,


Ian





Re: [analog-help] Couple of ?'s regarding Spiders, Crawlers, Search Engines and Logging

2001-01-26 Thread Duke Hillard

For information about analyzing search engine requests,
you might want to look at SEARCHENGINE commands
in the Configuration Files section of the Helper Applications
page (http://www.analog.cx/helpers/index.html#conffiles) and
at the SEARCHENGINE command itself in the Analog docs
(http://www.analog.cx/docs/args.html#SEARCHENGINE).
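
For example (untested, so check the docs for the exact spelling; the engine patterns here are only a sample), a few lines like these in analog.cfg tell Analog which query argument each engine uses and turn the search reports on:

    CONFIGFILE srch.cfg                # pull in the third-party list, and/or:
    SEARCHENGINE http://*google.*/*    q
    SEARCHENGINE http://*altavista.*/* q
    SEARCHENGINE http://*yahoo.*/*     p
    SEARCHQUERY ON                     # Search Query Report
    SEARCHWORD ON                      # Search Word Report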

-- Duke Hillard, University Webmaster, UL Lafayette



