On Wed, 29 Nov 2006, Dan Richardson wrote:

> In my htdig conf file I have:
>
> start_url: http://www.mydomain.co.uk/
> limit_urls_to: http://www.mydomain.co.uk/
> exclude_urls: /usr/local/htdig/conf/excludes

To pull in a file you should enclose the path in backticks. For example:

exclude_urls: `/usr/local/htdig/conf/excludes`

> /usr/local/htdig/conf/excludes contains:
>
> /cgi-bin/ .cgi action=vote&voteid bookmark.html email.html reddit.com
> del.icio.us www.google.com digg.com ma.gnolia.com www.newsvine.com

You might want to consider moving action=vote&voteid to the bad_querystr
attribute, which exists specifically for this purpose.

> As I understand it both limit_urls and exclude_urls are string patterns, but
> which one takes precedence?

I believe exclude_urls is tested before limit_urls_to. Either way, as far
as I know a URL has to pass both tests to be indexed, so a match on
exclude_urls should be enough to reject it.

> I have links on my site such as:
> http://reddit.com/submit?url=http://www.mydomain.co.uk/about_us/for_your_site/email.html
> which contains both exclude_urls and limit_urls strings and htdig seems to
> be trying to index these links, any pointers on how I can definitely exclude
> them from an index?

You don't have more than one exclude_urls line in your config file, do you?
If the default definition is still lurking in the file it might be
overriding your custom settings.

Jim
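
P.S. Pulling the suggestions above together, the relevant part of the
config file might end up looking something like the sketch below. It is
untested and only meant as an illustration: the bad_querystr value is
simply the pattern taken from your excludes file, and I am assuming you
delete that pattern from the excludes file and remove any other
exclude_urls line left in the config.

```
# Sketch only; adjust to your own setup.
start_url:      http://www.mydomain.co.uk/
limit_urls_to:  http://www.mydomain.co.uk/

# Backticks tell htdig to read the pattern list from the file.
exclude_urls:   `/usr/local/htdig/conf/excludes`

# Query-string pattern moved out of the excludes file.
bad_querystr:   action=vote&voteid
```

With that in place, the reddit.com pattern from the excludes file should
be what rejects the http://reddit.com/submit?url=... links.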

