Hi Hayden,

Thanks a ton! Yep, I think TikaJAXRS will be a viable option for remote Tika
extraction.
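
For what it's worth, here is a minimal sketch of what a remote call could look like from a client script. It is written in Python rather than PHP purely for illustration, and it rests on some assumptions: that the server is reachable at localhost:9998 (adjust to wherever yours listens), that it accepts the raw document via HTTP PUT on /tika for extracted text and /meta for metadata, and that the helper names (build_tika_request, extract) are hypothetical ones of my choosing. Please check the wiki page for the endpoints your Tika version actually exposes.

```python
import urllib.request

# Assumption: a Tika JAXRS server reachable at this address.
TIKA_URL = "http://localhost:9998"


def build_tika_request(data, endpoint="tika", base_url=TIKA_URL):
    """Build (but do not send) a PUT request carrying the raw document bytes.

    endpoint is assumed to be "tika" for extracted text or "meta" for metadata.
    """
    url = f"{base_url.rstrip('/')}/{endpoint}"
    # Send a generic content type so the server detects the real format itself.
    headers = {"Content-Type": "application/octet-stream"}
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


def extract(path, endpoint="tika", base_url=TIKA_URL):
    """Read a local file, send it to the server, and return the response body."""
    with open(path, "rb") as fh:
        req = build_tika_request(fh.read(), endpoint, base_url)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With a server running, extract("InformationRepository.pdf", "meta") would be the metadata-only call and extract("InformationRepository.pdf") the plain-text one. Keeping the request construction separate from the network call makes it easy to inspect what will be sent before pointing it at a real server.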

Let me know how I can help.

Thanks much!

Cheers,
Chris

On Jul 20, 2012, at 10:13 AM, Mr Havercamp wrote:

> Hi Chris
> 
> Thanks for the reply. I will check it out and let you know how I go.
> 
> I am developing an extension for Joomla that uses Solr and Tika to index
> content and attachments. I have three configuration options for users to
> select when specifying a method to extract content and metadata from files: a
> local install of the Tika app, SolrCell, or a remote Tika server. In your
> opinion, would TikaJAXRS be a viable option for remote Tika extraction (for
> example, running on a separate server), especially with regard to performance
> and security?
> 
> Thanks again
> 
> 
> Hayden
> 
> On 20/07/12 23:30, Mattmann, Chris A (388J) wrote:
>> Hi Hayden,
>> 
>> Thanks for your email! Have you tried the Tika JAXRS server, documented here:
>> 
>> https://issues.apache.org/jira/browse/TIKA-593
>> http://wiki.apache.org/tika/TikaJAXRS
>> 
>> It first appeared in 1.2 and listens on a port (9998 by default), so you can
>> send it documents with cURL.
>> 
>> Cheers,
>> Chris
>> 
>> On Jul 20, 2012, at 8:17 AM, Mr Havercamp wrote:
>> 
>>> Have been playing around with integrating Tika into my PHP app.
>>> 
>>> I have had great success with Tika on the command line and also SolrCell.
>>> 
>>> However, I was wondering if there is some way of running Tika in server
>>> mode and extracting a document, say, via cURL.
>>> 
>>> I have had varying degrees of success with:
>>> 
>>> nc localhost 30000 < /opt/lampp/htdocs/joomla25/tmp/InformationRepository.pdf
>>> 
>>> but I'm wondering how to pass other params, such as for extracting just
>>> metadata or content in HTML format.
>>> 
>>> Any help would be much appreciated.
>>> 
>>> Cheers
>>> 
>>> 
>>> Hayden
>> 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: [email protected]
>> WWW:   http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 
> 


