Pavan, awesome, glad to have your interest and to have you in the community! Check out our JIRA:
https://issues.apache.org/jira/browse/TIKA

My own recent interests in Tika are Named Entity Recognition (Stanford NER, CoreNLP, and OpenNLP), automated IR-based geo-gazetteers, audio/video extraction, and so forth. I'm also interested in language identification (N-grams; MIT-LL’s Text.jl) and automated machine translation (Joshua, Moses).

If you are interested in that type of work, look for issues I reported, issues assigned to me, or issues with the label “memex”. More generally, if you are interested in the kinds of work I’m contributing to Tika, see http://memex.jpl.nasa.gov/

Cheers, and happy holidays!

Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Pavan Sudheendra <pavan0...@gmail.com>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Sunday, December 20, 2015 at 9:52 AM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: Looking to contribute

>Hi all,
>
>My name is Pavan and I'm a software engineer who has been working at Cisco
>on big data projects for the past 2 years.
>
>I'm looking to contribute to the Tika project and I'm wondering if I should
>start looking at the Github issues page or somewhere else?
>
>I've started reading the documentation and getting familiar with the build
>process.
>
>Also, any guidance on this subject would be great.
>
>Thanks all.
>
>--
>Regards-
>Pavan