I am trying to index the contents of my web site by asking htdig to start at each of the URLs in my site *and* limit the search to that same set of URLs. In other words, I only want what is within that section of the Internet. So for http://www.internetghana.com/digisign and http://www.ghana.com, htdig should start at both sites and limit traversal to those same sites.

htdig appears to hang when I do this for about ... 200 URLs. I also tried creating a configuration file for each URL, but htdig simply rotated between htdig.sdsu.edu (not mentioned anywhere in the URL list) and my local web server (one of the start URLs). What gives?

The configuration file causing the problems is at http://www.webstar.com.gh/htdig.conf.txt

BTW, strace shows the last system call as an open on the configuration file. There are no more system calls after that, and CPU utilization is extremely high.

Before I go in and try to debug, I would like to know if this has been solved by anyone else. It shouldn't take this long to create a list of patterns. A cursory glance at the code showed that htdig would most likely be building the regexps ...
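
For illustration only, the setup described in this message might look roughly like the fragment below. It is a minimal sketch, assuming the standard `start_url` and `limit_urls_to` attributes and showing just the two URLs named in the message; the actual configuration file lists about 200 URLs and may differ in detail.

```
# Sketch only: two of the start URLs named in the message; the real list is much longer.
start_url:      http://www.internetghana.com/digisign \
                http://www.ghana.com

# Restrict traversal to the same set of URLs (this is also htdig's default value).
limit_urls_to:  ${start_url}
```

Each entry in `limit_urls_to` is treated as a pattern that candidate URLs are matched against, which is why a long start list also means a long pattern list.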
