On Wed, 29 Nov 2006, Arianna Arona wrote:

> I've read all the FAQ but I can't get off of these two (related?) problems:
> 1) if I run rundig as root, while parsing pdf file I get:
> 
> PDF::parse(http://segramm.dico.unimi.it/common/docs/contratti/piano_utilizzo.pdf)
> PDF::parse: error running pdf_parser on
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> http://segramm.dico.unimi.it/common/docs/contratti/piano_utilizzo.pdf
>  size = 78585

This message results from a failure to find the 'acroread' program,
which is what htdig used to use as an internal PDF parser. This support
was completely removed from later versions. Unless you have an old
version of acroread floating around, this approach is probably a dead
end.

> But if I run it as normal user I get this different error:
> 
> 
> PDF::setContents(78585 bytes)
> PDF::parse(http://segramm.dico.unimi.it/common/docs/contratti/piano_utilizzo.pdf)
> PDF::parse: cannot open
>             ^^^^^^^^^^^
> //usr/share/webapps/htdig/3.1.6-r7/hostroot/htdig/db/htdig14605.pdf

In this case an attempt is being made to create a temporary file in
/usr/share/webapps/htdig..., which is a location your normal user
doesn't have permission to write to.

> WHAT? My htdig.conf says: database_dir:           /tmp/db
> usr/share/webapps/htdig/3.1.6-r7/hostroot/htdig/db was the *original*
> configured database_dir I commented out.

The path is derived from the TMPDIR environment variable, which is
probably being set in your rundig script. By default this variable is
set based upon the DBDIR path also defined in your rundig script.
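
As a rough sketch of what to look for, assuming your rundig sets these
variables near the top the way the stock script does (the exact lines
in your installed copy may differ, and /tmp/db is simply the
database_dir value from your htdig.conf), pointing both at a directory
your normal user can write to would look something like this:

```shell
# Assumed excerpt from rundig; check your own copy for the real lines.
# DBDIR should match database_dir in htdig.conf.
DBDIR=/tmp/db

# Temporary files (such as the htdigNNNNN.pdf file in the error above)
# are created under TMPDIR, so it must be writable by the user running
# rundig.
TMPDIR=$DBDIR
export TMPDIR
```

Editing rundig this way keeps the temporary files next to the
databases; setting TMPDIR to any other writable directory should work
just as well.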

> So, can anybody tell me something about how to solve this?

Use external parsers. They should override the old pdf_parser setup
which is leading to the problems above.

> 2) I've installed rtf2html and in doc2html.pl I've set up the full path
> to the executable. In htdig.conf I've added:
> external_parser:        application/pdf->text/html
> /usr/local/script/doc2html.pl \
>                         application/rtf->text/html
> /usr/local/script/doc2html.pl \
>                         text/rtf->text/html /usr/local/script/doc2html.pl \

The attribute name is external_parsers, not external_parser. Also, you
don't need the final \.
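
With both changes applied, and using the same script path you quoted,
the entry would look roughly like this (a sketch only; I have not
tested it against your setup):

```
external_parsers: application/pdf->text/html /usr/local/script/doc2html.pl \
                  application/rtf->text/html /usr/local/script/doc2html.pl \
                  text/rtf->text/html /usr/local/script/doc2html.pl
```

Each backslash continues the attribute value onto the next line, so
the last line has none.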

Jim
