Thanks Nick, and thanks Arturo for the offer to write a small guide to getting started with parsing. It would be good to track this in a JIRA issue. Arturo, can you head over to JIRA and create an issue to contribute a "get Tika parsing up and running in 5 minutes" quick start guide? You could then write the guide in APT format (see [1] for an example and [2] for more detailed information), add the new guide file to your local SVN checkout, create a patch, and attach it to the new issue. I'd be happy to get it into the documentation sources.

Thanks!

Cheers,
Chris

[1] http://svn.apache.org/repos/asf/tika/trunk/src/site/apt/formats.apt
[2] http://maven.apache.org/doxia/references/apt-format.html

On 7/13/10 3:54 AM, "Arturo Beltran" <arturo.belt...@uji.es> wrote:

That was my "big" problem all this time, I almost went crazy. Now it works perfectly, thank you very much for your help.

It might be interesting to write a small manual: "How to create a new Tika Parser for Dummies". It would simply include the three steps that I have finally figured out (new Parser, tika-mimetypes.xml, list the new parser).

Greetings, and thanks Nick, it has been a great help.

On 13/07/2010 12:37, Nick Burch wrote:
> On Tue, 13 Jul 2010, Arturo Beltran wrote:
>> I'm calling my parser using the Tika-app included, so I think I'm
>> using AutoDetectParser.
>
> You have to explicitly tell the AutoDetectParser to try your parser,
> in addition to the mime type definition
>
> List your new parser in:
> tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser
>
> and I think it should then be picked up
>
> Nick
>

--
Arturo Beltran Fonollosa
Institute of New Imaging Technologies (INIT): http://www.init.uji.es
Geographic Information research group: http://www.geoinfo.uji.es
Universitat Jaume I, Avda. de Vicente Sos Baynat s/n
E-12071, Castellón, Spain
mailto: arturo.belt...@uji.es

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
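
For the third step mentioned in this thread (listing the new parser in META-INF/services/org.apache.tika.parser.Parser), a quick sanity check may help when a parser is not being picked up. The sketch below is not part of Tika. It assumes the file follows the standard Java provider-configuration format (one fully qualified class name per line, with '#' starting a comment) and uses only the JDK. The class names `com.example.ExistingParser` and `com.example.MyFormatParser`, and the helper class `ServicesFileCheck` itself, are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: reads text in the standard Java provider-configuration
 * format (one class name per line, '#' starts a comment) and reports
 * whether a given parser class is listed. Not part of Tika.
 */
final class ServicesFileCheck {

    /** Returns the class names listed in the given services-file text. */
    static List<String> listedProviders(String servicesFileText) {
        List<String> names = new ArrayList<>();
        for (String line : servicesFileText.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the comment part
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    /** True if parserClassName appears as an entry (not inside a comment). */
    static boolean isListed(String servicesFileText, String parserClassName) {
        return listedProviders(servicesFileText).contains(parserClassName);
    }

    public static void main(String[] args) {
        // Hypothetical file contents; both class names are placeholders.
        String text = String.join("\n",
            "# Parsers that should be tried",
            "com.example.ExistingParser",
            "",
            "com.example.MyFormatParser  # newly added parser");
        System.out.println(isListed(text, "com.example.MyFormatParser")); // prints true
    }
}
```

In practice the text would be read from the services file in your checkout, and the class name would be that of your own parser. An entry that is commented out or misspelled would show up here as "not listed".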