On Thursday, December 5, 2002, at 08:39 AM, Ted Stresen-Reuter wrote:

I've got a cron job that runs rundig.sh every night. It is pointed at our own .conf file rather than the default one. htdig indexes our intranet, whose URL (also the start URL) is http://inside.hinshawlaw.com/

The problem, however, is that htdig is also indexing our web site at http://www.hinshawlaw.com/

What does your limit_urls_to attribute look like? A URL should only be indexed if it contains a string defined in limit_urls_to.
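
As a rough sketch only, assuming the intranet host is the one site you want indexed (I haven't seen your actual .conf file, so treat these values as illustrative), the relevant lines might look like this:

```
# Start crawling at the intranet home page.
start_url:      http://inside.hinshawlaw.com/

# Only follow URLs containing this string. Because the match is a
# substring test, a broader value such as "hinshawlaw.com" would also
# match http://www.hinshawlaw.com/ and pull in the public site.
limit_urls_to:  ${start_url}
```

If limit_urls_to in your file lists something shorter than the full intranet host name, that could explain why the public site is being picked up.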

Also, have you tried reindexing from scratch to rule out the possibility of the database containing some old URLs that were added using a previous configuration?

Jim

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
FAQ: http://htdig.sourceforge.net/FAQ.html