Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread raghu vittal
Hi Ken, Thanks for the reply. I understood your point. Here is what I have tried: > byte[] srcBytes = File.ReadAllBytes(filePath); > get the chunk of 1 MB out of srcBytes > when I pass this 1 MB chunk to Tika, it gives me an error. > As the WIKI Tika needs the entire file to extract con

Re: Tika Wiki Login

2016-02-24 Thread Sergey Beryozkin
Hi Chris, thanks, can't reply yet the issue is done :-) but will take care of it, Thanks, Sergey On 24/02/16 17:05, Mattmann, Chris A (3980) wrote: permission granted :) ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Mattmann, Chris A (3980)
yayyy! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW:

Re: Tika Wiki Login

2016-02-24 Thread Mattmann, Chris A (3980)
permission granted :)

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Sergey Beryozkin
Time to start contributing to Tika again :-) Cheers, Sergey On 24/02/16 17:01, Mattmann, Chris A (3980) wrote: thanks mucho my friend ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (39

Tika Wiki Login

2016-02-24 Thread Sergey Beryozkin
Hi Chris, can you please give me the rights to edit the wiki? I have all the docs signed. I can edit the CXF and Camel wikis with a 'sergey_beryozkin' login and thought I could do the same with Tika. Thanks, Sergey

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Mattmann, Chris A (3980)
thanks mucho my friend

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Sergey Beryozkin
Hi Chris, sure, I've opened https://issues.apache.org/jira/browse/TIKA-1871 and assigned it to myself; I will add some info about multipart/form-data asap. Cheers, Sergey On 24/02/16 16:40, Mattmann, Chris A (3980) wrote: +1 please just remove it from the wiki since it clearly supports that per you

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Mattmann, Chris A (3980)
+1 please just remove it from the wiki since it clearly supports that per your research thanks Sergey!

RE: Unable to extract content from chunked portion of large file

2016-02-24 Thread Ken Krugler
Hi Sergey, Thanks for digging into the code - I'd seen the docs and assumed it wouldn't work. Anybody have a chance to give that a try? Maybe Raghu? :) -- Ken > From: Sergey Beryozkin > Sent: February 24, 2016 7:44:13am PST > To: user@tika.apache.org > Subject: Re: Unable to extract content fr

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread Sergey Beryozkin
Hi All, if a large file is passed to a Tika server as a multipart/form payload, then CXF will itself create a temp file on disk. Hmm... I was looking for a reference to it and I found the advice not to use multipart/form-data: https://wiki.apache.org/tika/TikaJAXRS (in Services) I belie

RE: Unable to extract content from chunked portion of large file

2016-02-24 Thread Ken Krugler
Hi Raghu, I don't think you understood what I was proposing. I suggested creating a service that could receive chunks of the file (persisted to local disk). Then this service could implement an input stream class that would read sequentially from these pieces. This input stream would be passed
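
The input stream idea described above can be sketched roughly as follows. This is only an illustration, assuming the chunks have already been persisted as separate files on local disk and are listed in upload order; `ChunkedFileStream` and `open` are hypothetical names, not part of Tika. It uses the JDK's `java.io.SequenceInputStream` and needs Java 9+ for `List.of` and `readAllBytes`. Handing the resulting stream to Tika is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: presents several chunk files as one continuous stream.
public class ChunkedFileStream {

    /** Returns a stream that reads the chunk files back to back, in list order. */
    public static InputStream open(List<Path> chunks) {
        Iterator<Path> it = chunks.iterator();
        Enumeration<InputStream> streams = new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return it.hasNext();
            }

            @Override
            public InputStream nextElement() {
                try {
                    // Each chunk is opened only when the previous one is exhausted.
                    return Files.newInputStream(it.next());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
        // SequenceInputStream closes each chunk stream once it has been fully read.
        return new SequenceInputStream(streams);
    }

    public static void main(String[] args) throws IOException {
        // Two small stand-in chunks; real chunks would come from the upload service.
        Path dir = Files.createTempDirectory("chunks");
        Path part0 = Files.write(dir.resolve("part-0"), "hello ".getBytes(StandardCharsets.UTF_8));
        Path part1 = Files.write(dir.resolve("part-1"), "world".getBytes(StandardCharsets.UTF_8));

        try (InputStream in = open(List.of(part0, part1))) {
            // The consumer sees a single sequential stream of the whole file.
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8)); // prints: hello world
        }
    }
}
```

With this shape, the consumer never sees an isolated chunk, only the complete file in order, which matters for container formats that cannot be parsed from a fragment.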

Re: Jackson & Fat tika-server jar question

2016-02-24 Thread John Patrick
Cheers for the replies. I now understand how the Tika developers intended tika-server to be used, but for the custom code we have written we need to use a few classes that only live in tika-server. For Jackson I've done that as a separate pull request. tika-server restructure https://issues.apache.org

Re: Unable to extract content from chunked portion of large file

2016-02-24 Thread raghu vittal
Thanks for your reply. In our application, users can upload large files. Our intention is to extract the content out of large files and dump it into Elastic for content-based search. We have .xlsx and .doc files larger than 300 MB. Sending such a large file to Tika causes timeout issues. I tri