I am setting up a new ProLiant DL360 G4 server with Red Hat ES Linux 4
and Apache 2.0.x.
I had copied over htdig 3.1.6 from the old server, but decided to install
3.2.0b6 with a view to using it when the server goes live in a few
days. What a nightmare.
The htdig web site (http://www.htdig.org/dev/htdig-3.2/) is ambiguous
about 3.2.0b6 and PDF indexing: FAQ 1.13 simply refers to FAQ 4.9. I have
the xpdf package installed and used it with 3.1.6. When I indexed our web
site (3200 pages, half of them PDFs) it took over 13 hours. Yes,
thirteen hours!! And then it deleted every one of the PDFs. That was
using:
external_parsers:
application/pdf->text/html /var/www/cgi-bin/doc2html.pl
in htdig.conf.
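For reference, here is a minimal sketch of how that attribute might sit in htdig.conf, assuming the script path given above. The commented-out max_doc_size line is an assumption: it is an htdig attribute that limits how much of each document is retrieved, but the value shown is purely illustrative and nothing here confirms it is related to the PDFs being dropped.

```
# Hand PDFs to doc2html.pl, which is expected to return HTML for htdig to index.
external_parsers: application/pdf->text/html /var/www/cgi-bin/doc2html.pl

# Assumption, illustrative value only: raise the per-document size limit so
# that large PDFs are not cut short before they reach the converter.
# max_doc_size: 5000000
```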
I also tried acroconv.pl, but it didn't work at all.
I would appreciate some help with this.
Thanks
Bob