[htdig] How can I prevent htdig from crawling and/or reporting Apache indexes?

Mark Bartlett Tue, 19 Aug 2008 12:48:53 -0700

Hi Everyone!

I am using htdig to search a RoboHelp-generated website hosted on Apache.
It seems to crawl the site, but it also crawls the Apache index pages and
returns them in search results.
If I turn off the Apache indexes, htdig does not crawl the site.


An example Apache index page is at:
http://proddoc.groundworkopensource.com/Bookshelf_RoboHelp/Maintaining_GroundWork_Monitor/

I have tried adding patterns to the exclude_urls directive in htdig.conf,
but I am unsure how to use it properly.

FWIW, I have read FAQ 4.20 through 4.23.

So how can I prevent htdig from crawling and/or reporting Apache indexes?

Thanks,
Mark

_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
