I am setting up a new ProLiant DL360 G4 server with Red Hat ES Linux 4 and Apache 2.0.x.

I had copied over htdig 3.1.6 from the old server, but decided to install 3.2.0b6 with a view to using it when the server goes live in a few days. What a nightmare.

The htdig web site (http://www.htdig.org/dev/htdig-3.2/) is ambiguous about 3.2.0b6 and PDF indexing: FAQ 1.13 refers to FAQ 4.9. I have the xpdf package installed and used it with 3.1.6. When I indexed our web site (3200 pages, half of them PDFs) it took over 13 hours, yes, thirteen hours! And then it deleted every one of the PDFs. That was using:

external_parsers: application/pdf->text/html /var/www/cgi-bin/doc2html.pl

in htdig.conf.

I also tried acroconv.pl but it didn't work at all.
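
One way to narrow this down might be to time the PDF converter on its own, outside htdig, to see whether the conversion step is what is slow or failing. Below is a minimal sketch in Python, not something from the htdig distribution. It assumes the xpdf pdftotext tool is on the PATH and that "pdftotext FILE -" writes the extracted text to standard output; the function name time_converter and the "{path}" placeholder are made up for illustration.

```python
import subprocess
import sys
import time


def time_converter(command, path, timeout=60):
    """Run an external converter on one file.

    `command` is a list of arguments in which the literal string "{path}"
    is replaced by the file name (a convention invented for this sketch).
    Returns (ok, seconds, bytes_of_output).
    """
    args = [path if arg == "{path}" else arg for arg in command]
    start = time.monotonic()
    try:
        result = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Converter missing, not executable, or it ran past the timeout.
        return False, time.monotonic() - start, 0
    elapsed = time.monotonic() - start
    # Treat a non-zero exit status or empty output as a failed conversion.
    ok = result.returncode == 0 and len(result.stdout) > 0
    return ok, elapsed, len(result.stdout)


if __name__ == "__main__":
    # Assumption: pdftotext sends the text to stdout when the output is "-".
    converter = ["pdftotext", "{path}", "-"]
    for pdf in sys.argv[1:]:
        ok, seconds, size = time_converter(converter, pdf)
        status = "ok" if ok else "FAILED"
        print(f"{pdf}: {status}, {seconds:.2f}s, {size} bytes of text")
```

Run against a handful of the PDFs, this should show whether individual files take a long time to convert or produce no text at all. Either result would point at the converter rather than at htdig itself.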

I would appreciate some help with this.

Thanks

Bob