Re: [htdig3-dev] URL requirements

Gilles Detillieux Wed, 10 Feb 1999 03:20:53 -0500

According to mark williamson:
> After having installed and compiled htdig successfully (woohoo!), I came
> across an aspect of the program that's not covered in the docs.
> 
> How does/should htdig handle these two urls:
> 
> http://www.somedomain.com/
> 
> http://www.somedomain.com/products.html
> 
> My observation is that it will spider the site if it sees just a domain,
> but will index only that page if one is specified.  The reason I ask is
> that I pointed it to a file which is basically a map of the HTML
> documents of a site, and it did not follow any links.  But if I point it to
> the main domain name, it does indeed follow them.

What is your limit_urls_to set to?  By default it's ${start_url}, so if you
set start_url to a single page, only URLs that match that page's URL will be
followed, and the whole dig is restricted to it.  Try:

start_url:      http://www.somedomain.com/products.html
limit_urls_to:  http://www.somedomain.com/
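
To see why the default stops the dig at the map page, here is a rough
sketch in Python.  It is not htdig's actual code; it assumes that each
limit_urls_to pattern is matched as a plain substring of the absolute URL,
and the helper name url_allowed and the widgets link are made up for
illustration.

```python
def url_allowed(url, limit_urls_to):
    """Return True if any space-separated pattern is a substring of url.

    Assumption: patterns are plain substrings of the absolute URL.
    """
    return any(pattern in url for pattern in limit_urls_to.split())


start_url = "http://www.somedomain.com/products.html"
# Hypothetical link found on the map page.
link = "http://www.somedomain.com/widgets/index.html"

# Default: limit_urls_to is ${start_url}, so the link does not match.
print(url_allowed(link, start_url))                      # False

# Widened limit: the link matches and would be followed.
print(url_allowed(link, "http://www.somedomain.com/"))   # True
```

With the default, the only URL that passes is the start page itself, which
matches the "did not follow any links" behaviour described above.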

This probably belongs on the htdig list, rather than htdig3-dev.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930