Mattmann, Chris A (388J <chris.a.mattmann <at> jpl.nasa.gov> writes:

> 
> Hi Jo,
> 
> You may consider checking out Tika trunk, where we recently committed a Tika
> JAX-RS web service [1] as part of the tika-server module. You could probably
> wire DIH into it and accomplish the same thing.
> 
> Cheers,
> Chris
> 
> [1] https://issues.apache.org/jira/browse/TIKA-593
> 
> On Feb 24, 2011, at 12:42 PM, jo wrote:
> 
> > 
> > I have tried the steps indicated here:
> > http://wiki.apache.org/solr/ExtractingRequestHandler
> > 
> > and when I try to parse a document nothing happens, not even an error. I have
> > copied the jar files everywhere, and still nothing. Can anyone give me the steps
> > to upgrade just Tika? By the way, I am currently on Solr 1.4.1, which has Tika 0.4.
> > 
> > thank you
> > 
> > 
> > -- 
> > View this message in context: http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-tp2570526p2570526.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann <at> nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Hey Chris,

I have added tika-core 0.9 and tika-parsers 0.9 to Solr 1.4.1 (extraction/lib)
after building them from the source provided by Tika, and I now have an issue.
I am extracting PDF content using Solr. I have set fmap.content to
"attr_content" in the configurable params, so that I can see the entire
extracted document in that field. After the Tika update, attr_content no longer
appears in the search results. When I restore the old Tika 0.4 jars,
attr_content appears again. I did not see any exceptions in the console. Is
this a known behavior that someone has already faced? Can you guide me on how
to resolve it?
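
For reference, here is a minimal sketch in Python of the kind of extract request
described above. It only builds the request URL and sends nothing. The host,
port, document id, and the helper name build_extract_url are assumptions made
for illustration; the parameters (literal.id, fmap.content, uprefix, commit,
extractOnly) are the ones described on the ExtractingRequestHandler wiki page.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, content_field="attr_content",
                      extract_only=False):
    """Build a URL for Solr's /update/extract handler (hypothetical helper)."""
    params = {
        "literal.id": doc_id,           # unique key for the indexed document
        "fmap.content": content_field,  # map Tika's "content" to this field
        "uprefix": "attr_",             # prefix for fields not in the schema
        "commit": "true",               # commit so the document is searchable
    }
    if extract_only:
        # Return the extracted content in the response instead of indexing it.
        params["extractOnly"] = "true"
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local Solr instance; adjust to the actual host and core.
    print(build_extract_url("http://localhost:8983/solr", "doc1"))
    print(build_extract_url("http://localhost:8983/solr", "doc1",
                            extract_only=True))
```

Posting the PDF to the extractOnly variant returns the extracted content
without indexing it, which may help tell whether the content goes missing at
extraction time or later, in the field mapping.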

-- Surendra




