Re: Readers for extracting textual info from pd/doc/excel for indexing the actual content

Adrien Grand Sun, 27 Jan 2013 09:54:00 -0800

Have you tried using the PDFParser [1] and the OfficeParser [2]
classes from Tika?


This question is probably better suited to the Tika user mailing list [3].
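
As a rough illustration of the suggestion above, here is a minimal sketch of
calling the parse(InputStream, ContentHandler, Metadata, ParseContext) method
documented in [1] and [2]. It assumes the Tika parser jars are on the
classpath. The class name TikaTextExtractor, the helper extractText, and the
choice of parser by file extension are hypothetical and only there for
illustration; they are not part of Tika.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.microsoft.OfficeParser;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical helper class, not part of Tika.
public class TikaTextExtractor {

    // Runs the given parser over the stream and returns the body text it emits.
    public static String extractText(Parser parser, InputStream in) throws Exception {
        // -1 disables BodyContentHandler's default write limit on extracted characters.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        parser.parse(in, handler, metadata, new ParseContext());
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        String path = args[0];
        // Assumption for this sketch: pick the parser from the file extension.
        // OfficeParser targets the older binary Office formats (.doc, .xls, .ppt).
        Parser parser = path.toLowerCase().endsWith(".pdf")
                ? new PDFParser()
                : new OfficeParser();
        try (InputStream in = new FileInputStream(path)) {
            System.out.println(extractText(parser, in));
        }
    }
}
```

The returned string could then be placed in whatever field you index; that
part depends on your own indexing code and is not shown here.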

[1] http://tika.apache.org/1.3/api/org/apache/tika/parser/pdf/PDFParser.html#parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext)
[2] http://tika.apache.org/1.3/api/org/apache/tika/parser/microsoft/OfficeParser.html#parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext)
[3] http://tika.apache.org/mail-lists.html

-- 
Adrien
Re: Readers for extracting textual info from pd/doc/excel for indexing the actual content