Eric Jain wrote:

- Support for PowerPoint documents



May I ask how you extract text from PowerPoint documents? Any open
source tool, or your own code?



FYI, I recently discovered "ppthtml" in this package: http://chicago.sourceforge.net/xlhtml/


Also, "antiword" seems to work well for Word docs.

I also use a utility from xpdf (http://www.foolabs.com/xpdf/) for PDF text
extraction.


When you get down to it, I have found that the "portable C" tools above work better
than the pure Java ones available. To be fair, however, I have found that POI works fine
for XLS docs.
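
For anyone wanting to try the external-tool route, here is a rough sketch of how the three tools could be wrapped behind one function. It is only an illustration, not the setup described above: the names command_for and extract are made up for this example, and the command lines are assumptions (xpdf's pdftotext with "-" to write to stdout, antiword and ppthtml writing to stdout by default), so check the man pages for your installed versions. Note that ppthtml produces HTML, so its output would still need the markup stripped before indexing.

```python
import subprocess
from pathlib import Path

# Assumed command lines for each tool; verify against your installed versions.
EXTRACTORS = {
    ".pdf": lambda p: ["pdftotext", p, "-"],  # xpdf: "-" sends the text to stdout
    ".doc": lambda p: ["antiword", p],        # antiword prints plain text to stdout
    ".ppt": lambda p: ["ppthtml", p],         # ppthtml prints HTML to stdout
}


def command_for(path):
    """Return the argument list for the tool matching the file extension."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTRACTORS[suffix](str(path))
    except KeyError:
        raise ValueError(f"no extractor configured for {suffix!r}")


def extract(path, timeout=60):
    """Run the external tool and return whatever it wrote to stdout."""
    result = subprocess.run(
        command_for(path),
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,      # do not let a malformed document hang the indexer
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract("slides.ppt") would then return the tool's output as a string, provided the tool is on the PATH. Passing the arguments as a list rather than a single shell string avoids quoting problems with odd file names.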


- Dave


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]





