At 11:25 AM +1200 4/6/00, [EMAIL PROTECTED] wrote:
>I installed htdig-3.2.0b1 on a Dec Alpha running Debian Linux for
>testing.
>Configure ran ok and everything seemed to compile and install to
>the correct directories, but when I run htdig it only grabs the index
>page and doesn't follow the links (I have tried a few different servers
>that don't have any robots.txt files etc and get the same problem).
>Htmerge and htsearch run ok but I have a very small database of 1
>document.
My guess is that you've set start_url to point to a single page and left
limit_urls_to at its default, which is the value of start_url. In that
case only that page will be indexed, because it is the only URL that
matches the limit_urls_to attribute.
Example:
start_url: http://www.foo.com/bar.html
-> None of the links on this page will start with this URL, so they're all rejected.
Better:
start_url: http://www.foo.com/
-> All links off of this page will likely fall within the same URL-space.
Best:
start_url: http://www.htdig.org/
limit_urls_to: htdig.org
This restricts indexing to URLs within the htdig.org domain, so URLs on
subdomains, like http://dev.htdig.org/, work too.
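To make the three cases above concrete, here is a rough Python sketch of the check. It assumes limit_urls_to is treated as a list of plain substrings and that a link is kept if any of them occurs in the URL, which is how I understand the default matching. The function name is_within_limits is made up for illustration and is not part of ht://Dig.

```python
def is_within_limits(url, limit_patterns):
    """Return True if any pattern occurs as a substring of the URL.

    Hypothetical helper: an approximation of the limit_urls_to check,
    assuming plain substring matching.
    """
    return any(pattern in url for pattern in limit_patterns)


# "Example": the limit is the start page itself, so sibling pages are rejected.
page_only = ["http://www.foo.com/bar.html"]
print(is_within_limits("http://www.foo.com/baz.html", page_only))

# "Better": the limit is the site root, so pages on the same site are kept.
site_root = ["http://www.foo.com/"]
print(is_within_limits("http://www.foo.com/baz.html", site_root))

# "Best": the limit is a bare domain, so subdomains are kept as well.
domain = ["htdig.org"]
print(is_within_limits("http://dev.htdig.org/", domain))
```

Under that assumption, the first check fails and the other two pass, which matches the behaviour described for each start_url setting.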
>Anything obvious that I need to do? (other than sit back and wait
>for the stable release :-) )
There is a stable release: 3.1.5. Last I checked, it was the stable
package for Debian. All earlier releases, as well as 3.2.0b1, have
the security hole. If you missed the details of the hole, see the
Debian security updates.
Cheers,
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/