According to Curtis Ireland:
> Is there any way to have start_url get its list from an SQL back-end?
> Has anyone already built a patch to handle this?
>
> Here are a couple of solutions I can think of to bypass the problem,
> but I'm sure I'm not alone in desiring this feature.
>
> 1) Build a PHP page with links to all the sites we want to index.
> Have htDig use this as its start_url
> 2) Before htDig starts its database build, dump all the links to a text
> file and have the htdig.conf include this file
>
> The one problem with these two solutions is how would the limit_urls_to
> variable work? I want to make sure the links are properly indexed
> without going past the linked site.

Either solution seems workable; it just depends on your preference.
For the first solution, you'd need a limit_urls_to setting that's
liberal enough to allow through all the links that the PHP script will
spit out. You should probably also set max_hop_count to 1, so that htdig
stops after the first hop, i.e. the one from the PHP output to the
documents it references.
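
As a rough sketch of the first approach: the link page can be generated
by any script, not only PHP. The Python below is just an illustration.
The function name build_link_page and the example URLs are made up, and
in practice the list of URLs would come from your SQL query.

```python
from html import escape


def build_link_page(urls):
    """Return a minimal HTML page containing one link per URL."""
    items = "\n".join(
        # Escape each URL so characters such as & stay valid inside the HTML.
        '<li><a href="{0}">{0}</a></li>'.format(escape(u, quote=True))
        for u in urls
    )
    return "<html><body>\n<ul>\n" + items + "\n</ul>\n</body></html>\n"


if __name__ == "__main__":
    # Hypothetical URLs standing in for the rows returned by the database.
    print(build_link_page([
        "http://www.example.com/",
        "http://www.example.org/docs/",
    ]))
```

The page would then be served at whatever address you give as start_url.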
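
For the second solution, you could probably just leave limit_urls_to as
the default, which is the same as the value of start_url, and set
max_hop_count to 0. Every URL in start_url is at hop 0, so htdig would
index the listed documents themselves and follow no links from them.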
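
A minimal sketch of the dump step for the second approach, assuming the
links live in a table called sites with a column called url. Both names
are hypothetical, and sqlite3 is used here only because it ships with
Python; you would substitute the driver for your own SQL back-end.

```python
import sqlite3


def dump_start_urls(conn, out_path):
    """Write one URL per line to out_path and return how many were written."""
    # "sites" and "url" are assumed names; adjust the query to your schema.
    rows = conn.execute("SELECT url FROM sites ORDER BY url").fetchall()
    with open(out_path, "w") as out:
        for (url,) in rows:
            out.write(url.strip() + "\n")
    return len(rows)


if __name__ == "__main__":
    # In-memory database with made-up rows, standing in for the real back-end.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sites (url TEXT)")
    conn.executemany(
        "INSERT INTO sites VALUES (?)",
        [("http://www.example.com/",), ("http://www.example.org/docs/",)],
    )
    print(dump_start_urls(conn, "start_urls.txt"))
```

Run something like this just before htdig, then point start_url at the
resulting file. If I remember right, htdig will read an attribute's
value from a file when the file name is given in backquotes, but check
the configuration documentation for your version before relying on it.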
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930