First of all, get the latest version of catdoc, something
like 0.90 or so.

Second, there is another script around. See:
http://www.st.hhs.nl/htdig/parse_word_doc.pl

Third, there is mswordview, which translates Word 97 files
into HTML, but I don't know whether anyone uses that option.
Fourth, catdoc sometimes fails dramatically when a non-Word
file ends with .doc and gets parsed by it. It crashed
htdig at my place...
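
One possible guard against that last problem is to look at the first
bytes of the file before handing it to catdoc. The sketch below is only
an illustration, not something that ships with htdig or catdoc: the names
OLE2_MAGIC and looks_like_ole2 are made up here. It assumes the documents
are Word 97 style files, which are stored in an OLE2 compound container
and start with a fixed 8-byte signature. Older Word formats that are not
OLE2 containers would be rejected by this check as well.

```python
# Hypothetical helper: decide whether a ".doc" file is worth passing to catdoc.
# Assumption: real Word 97 documents are OLE2 compound files, which begin
# with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1.

OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_ole2(path):
    """Return True if the file at `path` starts with the OLE2 signature."""
    with open(path, "rb") as fh:
        header = fh.read(len(OLE2_MAGIC))
    return header == OLE2_MAGIC


if __name__ == "__main__":
    import sys

    # Report each file given on the command line; a wrapper script could
    # skip the files marked "skip" instead of running catdoc on them.
    for name in sys.argv[1:]:
        verdict = "ok" if looks_like_ole2(name) else "skip"
        print(verdict, name)
```

A wrapper around the external parser could call such a check first and
simply produce no output for files that fail it, so that a plain text
file named something.doc never reaches catdoc.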

U.O. Telematica Municipale - Comune di Prato wrote:
> 
> Hi people !!! I tried to use the external parser htparsedoc from the contrib
> dir: I compiled catdoc.c and all went OK. But when I try to run htdig,
> it dumps core. Is there another external parser available for MS Word
> documents? If not, can you tell me how to configure it?
> 
> This is what I've done with my htdig configuration.
> 
> I added this line to htdig.conf:
> 
> external_parsers:       application/msword      /usr1/htdig/bin/htparsedoc
> 
> When htdig finds a document with that MIME type, it launches htparsedoc.
> But at the end of the indexing process I found a core file in the bin directory.
> 
> Ah, I run htdig on Linux Slackware 2.0.35 (Pentium Celeron 266 MHz, 64 MB RAM).
> 
> Thanks a lot
> Ciao
> Gabriele
> 
> ----------------------------------------------------------
> 
>  U.O. Rete Civica - Comune di Prato
>  Via Ricasoli, 4 - 59100 Prato PO Italia
>  Tel. +39 0574616342    Fax +39 0574616003
> 
>  http://www.comune.prato.it
>  E-Mail: [EMAIL PROTECTED]
> 
> ----------------------------------------------------------
> ------------------------------------
> To unsubscribe from the htdig3-dev mailing list, send a message to
> [EMAIL PROTECTED] containing the single word "unsubscribe" in
> the SUBJECT of the message.
