Hi Hayden,

Thanks a ton! Yep I think TikaJAXRS will be a viable option for remote tika extraction.
Let me know how I can help.

Thanks much!

Cheers,
Chris

On Jul 20, 2012, at 10:13 AM, Mr Havercamp wrote:

> Hi Chris
>
> Thanks for the reply. I will check it out and let you know how I go.
>
> I am developing an extension for Joomla which uses Solr and Tika to index
> content and attachments. I have three configuration options for users to
> select when specifying a method to extract content and metadata from files; a
> local install of the tika app, SolrCell, or a remote tika server. In your
> opinion, would TikaJAXRS be a viable option for remote tika extraction (for
> example, running on a separate server) especially in regards to performance
> and security?
>
> Thanks again
>
>
> Hayden
>
> On 20/07/12 23:30, Mattmann, Chris A (388J) wrote:
>> Hi Hayden,
>>
>> Thanks for your email! Have you tried the Tika JAXRS server, documented here:
>>
>> https://issues.apache.org/jira/browse/TIKA-593
>> http://wiki.apache.org/tika/TikaJAXRS
>>
>> It first appeared in 1.2 and can also be run on a port (9988 by default)
>> to handle cURL interactions.
>>
>> Cheers,
>> Chris
>>
>> On Jul 20, 2012, at 8:17 AM, Mr Havercamp wrote:
>>
>>> Have been playing around with integrating Tika into my PHP app.
>>>
>>> I have had great success with Tika on the command line and also SolrCell.
>>>
>>> However, I was wondering if there is some way of running Tika in server
>>> mode and extracting a document, say, via CURL.
>>>
>>> I have had varying degrees of success with:
>>>
>>> nc localhost 30000 <
>>> /opt/lampp/htdocs/joomla25/tmp/InformationRepository.pdf
>>>
>>> but I'm wondering how I pass other params such as for extracting just
>>> metadata or content in html format.
>>>
>>> Any help would be much appreciated.
>>>
>>> Cheers
>>>
>>>
>>> Hayden
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
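
For anyone following the thread, here is a rough sketch of what a client call against a running Tika JAXRS server might look like. It is only an illustration under assumptions: the host, the endpoint names ("tika" for text, "meta" for metadata), and the helper names build_tika_request and extract are not taken from the messages above, so please check them against the TikaJAXRS wiki page for your Tika version. Only the default port 9988 comes from the thread. Whether the server honors a given Accept header (for example for HTML output) also depends on the version.

```python
import urllib.request

# Assumed base URL: localhost is a placeholder; 9988 is the default port
# mentioned in the thread.
TIKA_BASE_URL = "http://localhost:9988"


def build_tika_request(endpoint, document_bytes, accept=None):
    """Build (but do not send) an HTTP PUT request carrying a document.

    `endpoint` is assumed to be a resource such as "tika" or "meta";
    verify the names your server version exposes.
    """
    # Send the file as raw bytes rather than as a URL-encoded form.
    headers = {"Content-Type": "application/octet-stream"}
    if accept:
        headers["Accept"] = accept
    return urllib.request.Request(
        url=f"{TIKA_BASE_URL}/{endpoint.lstrip('/')}",
        data=document_bytes,
        headers=headers,
        method="PUT",
    )


def extract(endpoint, path, accept=None, timeout=60):
    """Send the file at `path` to the server and return the response body."""
    with open(path, "rb") as fh:
        request = build_tika_request(endpoint, fh.read(), accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a server running, extract("meta", "InformationRepository.pdf") would request metadata only and extract("tika", "InformationRepository.pdf") the extracted content, assuming those endpoints exist in your version. The same PUT requests can be issued from PHP via its cURL bindings.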
