Antiword would be hard to integrate into Nutch as it is not Java-based. It
would require native calls.
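
One way to avoid JNI might be to run antiword as an external process and read
its standard output. Below is only a rough sketch, not something I have tested
against Nutch: the class and method names are made up for illustration, it
assumes the antiword binary is installed and on the PATH, and it assumes the
output is in the platform default charset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
public class AntiwordExtractor {

    // Runs an external command and returns everything it wrote to stdout.
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Let the child's error messages go to our own stderr so they
        // cannot fill a pipe and block the process.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command failed with exit code " + exitCode);
        }
        // Assumption: the tool writes text in the platform default charset.
        return new String(output, Charset.defaultCharset());
    }

    // Assumption: "antiword" is on the PATH and prints the document text
    // to stdout when given a .doc file as its only argument.
    static String extractText(String docPath) throws IOException, InterruptedException {
        return run(List.of("antiword", docPath));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: AntiwordExtractor <file.doc>");
            return;
        }
        System.out.println(extractText(args[0]));
    }
}
```

The returned string could then be indexed like any other text. The cost is one
process start per document and a dependency on a binary being present on every
machine that parses.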

Alexander

2008/11/12 Sertic Mirko, Bedag <[EMAIL PROTECTED]>

> Hi
>
> You can also use a tool called "antiword" to extract the text from a .doc
> file, and then give the text to Lucene.
>
> See here : http://en.wikipedia.org/wiki/Antiword
>
> Regards
> Mirko
>
> -----Original Message-----
> From: dipesh [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 12 November 2008 04:38
> To: java-user@lucene.apache.org
> Subject: Parsing MSWord
>
> Hello,
> I wanted to know if there are classes in Lucene that support parsing MSWord
> documents.
> Many thanks,
> Dipesh
>
> ----------------------------------------
> "Help Ever Hurt Never"- Baba
>



-- 
Best Regards
Alexander Aristov
