According to Willy Calderon:
> I've got a few output lines from doing a rundig in which I'm being asked 
> what to index.
> 
> =========================
> host# rundig -vvv
>          1:1:
> New server: , 0

OK, the two lines above tell me htdig isn't even seeing a valid URL to
begin with: the server name is empty and the port is 0.  That points
the finger at a faulty start_url definition...

> Unknown host: 0/robots.txt
>   pushed
> pick: , # servers = 1
> htmerge: Unable to open word list file '/opt/www/htdig/db/db.allwords.text'.
>    Did you index anything?
>    Check your config file and try running htdig again.
> ==========================
> 
> At the moment my htdig.conf file looks something like this
> ==========================
> database_dir:           /opt/www/htdig/db
> database_base:          ${database_dir}/db
> word_db:                ${database_base}.allwords.db
> word_list:              ${database_base}.allwords.text
> config_dir:             /opt/www/htdig/conf
> common_url:             /var/www/htdocs/www/
> start_url:              ${common_dir}/index.html

Bingo!  The 3.1.x series of htdig only handles http:// URLs.  You can't
use a bare UNIX directory pathname as a URL, and that is what you get
here, because the ${common_dir} attribute expands to a UNIX directory
path.  (Even with the 3.2 betas, which allow protocols other than HTTP,
you still need to give the protocol explicitly in the URL, even for
file:/ URLs.)
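
As a rough sketch of the fix, the start_url line would look something
like the one below.  The host name www.example.com is only a placeholder,
not something taken from your setup; substitute whatever host name your
web server actually answers to.

==========================
# www.example.com is a placeholder host name, not your real one
start_url:              http://www.example.com/index.html
==========================

With a full http:// URL there, htdig should report a real server name and
port on the "New server:" line instead of the empty name and 0 shown above.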

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev