Hi,
 
I've used conv_doc.pl (with XPDF) to index PDF documents.  I am now trying to index MS Word documents, but am having problems.
 
I've copied the conv_doc.pl script into /usr/local/bin. The script contains the line:
 
$CATDOC = "/usr/local/bin/catdoc";
 
I've installed the CATDOC package (which has placed the catdoc binary in /usr/local/bin and /usr/local/lib).
 
I've placed the following line in the htdig.conf file:
 
application/msword->text/html /usr/local/bin/conv_doc.pl
 
But when I try to re-index my website (this time hoping to index the Word documents too), I get the following error message, which appears next to each Word document:
 
test.doc: can't determine type of file /var/www/html/htdig/dv/htdex.8KvYOL; content-type: application/msword; URL: http://10.5.1.35/sme/micro/management_self_assessment_guide/test/doc size = 11264
 
Then htmerge deletes it because no excerpt is found.
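
Since the error says the file type can't be determined, one thing worth checking is whether test.doc really starts with the signature that legacy Word files carry. Below is a minimal sketch of such a check. It assumes the converter identifies files by their leading bytes, which I have not confirmed from conv_doc.pl itself, and the function name looks_like_word_doc is made up for illustration.

```python
# OLE2 compound-document signature found at the start of legacy
# MS Word .doc files (bytes D0 CF 11 E0 A1 B1 1A E1).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_word_doc(path):
    """Return True if the file at `path` begins with the OLE2 signature.

    Hypothetical helper: it only inspects the first 8 bytes and says
    nothing about whether catdoc can actually convert the file.
    """
    with open(path, "rb") as handle:
        return handle.read(len(OLE2_MAGIC)) == OLE2_MAGIC
```

Calling looks_like_word_doc() on a local copy of test.doc would show whether the document has the expected header. If it returns False, the file may have been saved in another format (RTF or HTML, for example) under a .doc name.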
 
What am I missing?
 
Any help would be appreciated
 
Thanks,
Shams
 
