I just finished writing a chapter for Lucene in Action that deals with
that.
PDF: pdfbox.org
MS Word/Excel: jakarta.apache.org/poi
WP: http://www.google.com/search?q=java+word+perfect+parser
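Note that what you need are parsers, not analyzers. A parser extracts the
plain text from a file format; the term Analyzer has a special meaning in
the Lucene realm, where it refers to the component that tokenizes that
text for indexing.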
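
To make the parser side concrete, here is a minimal sketch of picking a
parser by file extension. DocumentParser and ParserRegistry are hypothetical
names made up for illustration; they are not part of Lucene, PDFBox or POI.
Only a plain-text parser is registered here. A real setup would register
wrappers around the libraries above for pdf, doc, xls and wpd, and hand the
returned text to Lucene for analysis.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ParserRegistry {

    /** Hypothetical interface: turns raw file bytes into plain text. */
    public interface DocumentParser {
        String extractText(byte[] content) throws Exception;
    }

    // Maps a lower-cased file extension (without the dot) to its parser.
    private final Map<String, DocumentParser> parsers = new HashMap<>();

    public void register(String extension, DocumentParser parser) {
        parsers.put(extension.toLowerCase(Locale.ROOT), parser);
    }

    /** Looks up a parser by the file's extension and runs it. */
    public String extractText(String fileName, byte[] content) throws Exception {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        DocumentParser parser = parsers.get(ext);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for: " + fileName);
        }
        return parser.extractText(content);
    }

    public static void main(String[] args) throws Exception {
        ParserRegistry registry = new ParserRegistry();
        // Plain text needs no real parsing: just decode the bytes.
        registry.register("txt", bytes -> new String(bytes, StandardCharsets.UTF_8));

        byte[] content = "hello lucene".getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.extractText("notes.TXT", content));
    }
}
```

The point of the indirection is that the indexing code only ever sees
plain text, so adding WordPerfect support later means registering one more
parser rather than touching the indexer.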
Otis
---
Is there an analyzer for WordPerfect files?
I have a need to be able to index WP files as well as MS files, pdfs, etc.