I am trying to index the contents of my web site by asking htdig
to start at each of the URLs in my site *and* limit the search
to that same set of URLs.

In other words, I only want what is within that section of the
internet.

So for http://www.internetghana.com/digisign and http://www.ghana.com,
htdig should start at both sites and limit traversal to those same
sites.
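
For illustration, a minimal sketch of the setup I mean, using only the
two example URLs above and assuming the standard start_url and
limit_urls_to attributes. This is not the actual configuration file
referenced further down, which lists about 200 URLs.

```
# Sketch only: the two example URLs from this message as start points.
start_url:      http://www.internetghana.com/digisign \
                http://www.ghana.com

# Restrict traversal to URLs that match one of the start URLs.
limit_urls_to:  ${start_url}
```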

htdig appears to hang when I do this for about 200 URLs.

I also tried creating a configuration file for each URL, but htdig
simply alternated between htdig.sdsu.edu (not mentioned anywhere in
the URL list) and my local web server (one of the start URLs).

What gives?

The configuration file causing the problems is at
        http://www.webstar.com.gh/htdig.conf.txt

BTW, strace shows the last system call as an open on the configuration
file. There are no more system calls after that, and CPU utilization
is extremely high.

Before I go in and try to debug, I would like to know if this has been
solved by anyone else. It shouldn't take this long to create a list of
patterns. A cursory glance at the code showed that htdig would most
likely be building the regexps
...
