RE: Re-using a TikaStream

2021-02-25 Thread Peter Kronenberg
Or reading from the cloud, either Google or AWS, in which case I also get a stream. I know what the file name is, but can’t really use it
From: Peter Kronenberg Sent: Thursday, February 25, 2021 11:19 AM To: talli...@apache.org Cc: lfcnas...@gmail.com; user@tika.apache.org Subject: RE:

Re: Re-using a TikaStream

2021-02-25 Thread Tim Allison
Are you initializing with a file or a stream?
On Thu, Feb 25, 2021 at 9:00 AM Peter Kronenberg wrote:
> But how is TikaInputStream allowing me to re-use the stream without me doing anything special? Is it automatically spooling to disk as needed?
>
> I wouldn’t say that I can’t afford to

RE: Re-using a TikaStream

2021-02-25 Thread Peter Kronenberg
But how is TikaInputStream allowing me to re-use the stream without me doing anything special? Is it automatically spooling to disk as needed? I wouldn’t say that I can’t afford to spool to disk. I’m just looking for the most reasonable solution. I don’t know how big the streams are that

Re: Re-using a TikaStream

2021-02-25 Thread Tim Allison
My $0.02 would be to use TikaInputStream because that gets a lot more use and is battle-tested. Within the last year or so, we started using RereadableInputStream in one of the Microsoft format parsers so it is also getting some use now. If you absolutely can't afford to spool to disk, then give
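
For readers following the thread, the "spool to disk" idea can be sketched with the JDK alone. This is not Tika's code and does not use the Tika API. The class name SpoolDemo and its helper methods are hypothetical, made up for illustration. The sketch assumes you start from a one-shot InputStream (for example, one handed back by a cloud client) and want to read the same bytes more than once without holding them all in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical illustration only: a stdlib sketch of spooling a stream to a
// temp file so it can be re-read. It is not how Tika is implemented.
class SpoolDemo {

    // Copy the whole stream into a fresh temp file and return its path.
    static Path spoolToTempFile(InputStream in) {
        try {
            Path tmp = Files.createTempFile("spool-", ".tmp");
            // createTempFile already created the file, so overwrite it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Open a new stream over the spooled file and read it to the end.
    // Each call starts again from the first byte.
    static byte[] readAll(Path spooled) {
        try (InputStream in = Files.newInputStream(spooled)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "example payload".getBytes(StandardCharsets.UTF_8);
        // Stand-in for a stream that can only be consumed once.
        InputStream oneShot = new ByteArrayInputStream(original);

        Path spooled = spoolToTempFile(oneShot);
        byte[] firstPass = readAll(spooled);   // e.g. type detection
        byte[] secondPass = readAll(spooled);  // e.g. parsing

        System.out.println(Arrays.equals(firstPass, original)
                && Arrays.equals(secondPass, original));

        // The caller owns the temp file and must clean it up.
        spooled.toFile().delete();
    }
}
```

The trade-off is the one discussed above: spooling costs disk I/O and needs cleanup, while an in-memory buffer avoids the disk but is bounded by available heap.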