RE: Re-using a TikaStream

2021-02-23 Thread Peter Kronenberg
So this might be moot, because it seems that TikaInputStream is already doing some magic and I’m not sure how. I was able to re-use the stream without doing anything special after a call to parse. And in fact, I displayed stream.available() and stream.position() before and after the call to

RE: Re-using a TikaStream

2021-02-23 Thread Peter Kronenberg
I just found the RereadableInputStream. This looks more like what I was thinking. Is there any reason not to use it? What are the Tika best practices? Pros/Cons of each approach? If RereadableInputStream works as it’s supposed to, I’m not sure I see the advantage of InputStreamFactory

RE: Re-using a TikaStream

2021-02-23 Thread Nick Burch
On Tue, 23 Feb 2021, Peter Kronenberg wrote: I was re-reading some emails with Nick Burch back around Dec 22-23 and maybe I misunderstood him, but it sounds like he was saying that TikaInputStream was smart enough to automatically spool the stream to disk to allow re-use. If a parser knows
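
For readers following the thread, the general idea of spooling a stream to disk so it can be read more than once can be sketched with the plain JDK. This is not Tika's implementation and it uses no Tika classes. The class name SpoolingSketch and the helpers spoolToTempFile and readAll are hypothetical names chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration, not part of Tika: copy a one-shot stream to a
// temporary file, then open a fresh stream over that file for every read.
public class SpoolingSketch {

    // Drains the given stream into a new temporary file and returns its path.
    // The caller is responsible for deleting the file when done.
    public static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".tmp");
        // createTempFile already created the file, so overwrite it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Opens a new stream over the spooled file and reads it fully as UTF-8.
    public static String readAll(Path spooled) throws IOException {
        try (InputStream in = Files.newInputStream(spooled)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a stream that can only be consumed once.
        InputStream original =
                new ByteArrayInputStream("hello tika".getBytes(StandardCharsets.UTF_8));

        Path spooled = spoolToTempFile(original);
        try {
            String first = readAll(spooled);   // first pass over the data
            String second = readAll(spooled);  // second pass, same bytes
            System.out.println(first.equals(second));
        } finally {
            Files.deleteIfExists(spooled);
        }
    }
}
```

The trade-off in a design like this is disk I/O and temporary-file cleanup in exchange for being able to re-read input of any size without holding it in memory.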