Hi Bob, +1
Cheers. Sergey
On 23/01/17 04:17, Bob Paulin wrote:
Hi,
I'd like to propose an Apache Tika[1] connector for Apache Camel. I see
Camel already uses a number of the libraries Tika builds on, such as
PDFBox, but it could be interesting to have the full assortment of file
parsers available to convert files to text.
The basic configuration would offer two operations: MIME type detection
and parsing files to text.
tika:detect
File/InputStream -> camel-tika -> MIME type
tika:parse
File/InputStream -> camel-tika -> OutputStream containing the extracted text
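To make the two operations a bit more concrete, here is a minimal sketch of
the input/output contract only. It is not the proposed implementation and it
does not use Tika or Camel: the class name TikaEndpointSketch and both method
names are hypothetical, detection uses the JDK's
URLConnection.guessContentTypeFromStream as a much weaker stand-in for Tika's
detector, and "parse" simply copies bytes through where a real component
would delegate to a Tika parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the tika:detect / tika:parse contract.
// JDK-only stand-ins are used; a real component would call into Tika.
final class TikaEndpointSketch {

    // Models tika:detect: InputStream in, MIME type out.
    // The stream must support mark/reset, otherwise the JDK gives up
    // and we fall back to a generic type.
    static String detect(InputStream in) {
        try {
            String type = URLConnection.guessContentTypeFromStream(in);
            return type != null ? type : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Models tika:parse: InputStream in, text written to an OutputStream.
    // Stand-in only: bytes are copied unchanged, so this is only
    // meaningful for input that is already plain text.
    static void parse(InputStream in, OutputStream out) {
        try {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(new ByteArrayInputStream(gifHeader)));

        ByteArrayOutputStream text = new ByteArrayOutputStream();
        parse(new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)), text);
        System.out.println(new String(text.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In a Camel route the same two steps would presumably sit behind the
tika:detect and tika:parse endpoint URIs, with the message body carrying the
File or InputStream in and the MIME type or text out.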
I have a basic implementation that I'd be happy to send in a PR, but I
wanted to see whether this is something the community is interested in. I
think it might be interesting to combine a project that integrates
everything with the project that parses everything. I also think having
a camel-tika component might help achieve some of Tika's 2.0 goals[2].
- Bob Paulin
[1] https://tika.apache.org/
[2] https://wiki.apache.org/tika/Tika2_0RoadMap
--
Sergey Beryozkin
Talend Community Coders
http://coders.talend.com/