Mattmann, Chris A (388J) <chris.a.mattmann <at> jpl.nasa.gov> writes:
> Hi Jo,
>
> You may consider checking out Tika trunk, where we recently have a Tika JAX-RS web service [1] committed as
> part of the tika-server module. You could probably wire DIH into it and accomplish the same thing.
>
> Cheers,
> Chris
>
> [1] https://issues.apache.org/jira/browse/TIKA-593
>
> On Feb 24, 2011, at 12:42 PM, jo wrote:
>
> > I have tried the steps indicated here:
> > http://wiki.apache.org/solr/ExtractingRequestHandler
> >
> > and when I try to parse a document nothing would happen, no error.. I have
> > copied the jar files everywhere, and nothing.. can anyone give me the steps
> > on how to upgrade just tika, btw, currently on 1.4.1 has tika 0.4
> >
> > thank you
> >
> > --
> > View this message in context: http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-tp2570526p2570526.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann <at> nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Hey Chris,

I have added tika-core 0.9 and tika-parsers 0.9 to Solr 1.4.1 (extraction/lib) after building them from the source provided by Tika. Now I have an issue with this setup. I am extracting PDF content using Solr. I have set fmap.content to "attr_content" in the configurable params, so that I can see the entire extracted document in that field. After the Tika update, attr_content no longer appears in the search results.
When I restore the old Tika 0.4 jars, attr_content appears again. I did not find any exceptions in the console.

Is this a known behavior that someone has already faced? Can you guide me in resolving this?

--
Surendra
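
For reference, the request setup described above could be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual configuration: the base URL http://localhost:8983/solr, the document id "doc1", and the helper name build_extract_url are hypothetical, and the handler is assumed to be mapped at /update/extract as on the wiki page. The fmap.content, literal.id, commit, and extractOnly parameters are the ones documented for the ExtractingRequestHandler. Requesting the same PDF with extractOnly=true under both sets of jars may help show whether Tika 0.9 returns any body text at all, or whether the text is lost on the way into attr_content.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id=None, content_field="attr_content",
                      extract_only=False):
    """Build an ExtractingRequestHandler URL (hypothetical helper).

    base_url is the Solr root, e.g. http://localhost:8983/solr (assumed).
    """
    # Map Tika's extracted "content" onto the field named in the post.
    params = {"fmap.content": content_field}
    if extract_only:
        # Ask Solr to return the extracted text without indexing it.
        params["extractOnly"] = "true"
    else:
        # Index the document under the given id and commit right away.
        params["literal.id"] = doc_id
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    # The PDF itself would be POSTed to these URLs as the request body.
    print(build_extract_url("http://localhost:8983/solr", doc_id="doc1"))
    print(build_extract_url("http://localhost:8983/solr", extract_only=True))
```

If the extractOnly response is empty with the 0.9 jars but populated with the 0.4 jars, the difference would seem to lie in extraction rather than in the field mapping.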