Hi,
I've used conv_doc.pl (with XPDF) to index PDF
documents. I am now trying to index MS Word documents, but am having
problems.
I've copied the conv_doc.pl script into /usr/local/bin. The script
contains the line:
$CATDOC = "/usr/local/bin/catdoc";
I've installed the catdoc package (which has placed
the catdoc binary in /usr/local/bin and /usr/local/lib).
I've placed the following line in the htdig.conf file:
application/msword->text/html /usr/local/bin/conv_doc.pl
But when I try to re-index my website (this time hoping to index the
Word documents too), I get the following error message, which appears
next to the Word documents:
test.doc: can't determine type of file /var/www/html/htdig/dv/htdex.8KvYOL; content-type: application/msword; URL: http://10.5.1.35/sme/micro/management_self_assessment_guide/test/doc
size = 11264
Then htmerge deletes the document because no excerpt is found.
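One way to narrow down a "can't determine type of file" error is to check whether the fetched file really starts with the signature of an OLE2 document, which is the container format used by .doc files. The following is a minimal sketch of that check, not a copy of conv_doc.pl's logic: the function names are hypothetical, and the assumption that the converter decides the type from the leading bytes is unverified here.

```python
# Sketch only: the signatures checked here are assumptions about what a
# converter script might look for, not a copy of conv_doc.pl's logic.

# First 8 bytes of an OLE2 compound file (the container used by .doc files).
OLE2_SIGNATURE = bytes.fromhex("d0cf11e0a1b11ae1")
# Leading bytes of a PDF file.
PDF_SIGNATURE = b"%PDF-"


def sniff_type(head: bytes) -> str:
    """Classify a file from its leading bytes."""
    if head.startswith(OLE2_SIGNATURE):
        return "ole2"
    if head.startswith(PDF_SIGNATURE):
        return "pdf"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(8))
```

Calling sniff_file on a local copy of test.doc would show whether the file is a genuine Word document. A result of "unknown" would suggest that something else (for example an HTML page) is being served with the application/msword content type.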
Any help would be appreciated
Thanks,
Shams

