So this might be moot, because it seems that TikaInputStream is already doing
some magic and I’m not sure how.
I was able to re-use the stream without doing anything special after a call to
parse. And in fact, I displayed stream.available() and stream.position()
before and after the call to par
I just found the RereadableInputStream. This looks more like what I was
thinking. Is there any reason not to use it? What are the Tika best
practices? Pros/Cons of each approach? If RereadableInputStream works as it’s
supposed to, I’m not sure I see the advantage of InputStreamFactory.
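
To make the comparison concrete, here is a rough sketch of the two re-use styles using only JDK classes. It does not use Tika's API: BufferedInputStream with mark/reset stands in for a rewindable stream such as RereadableInputStream, and the StreamFactory interface is a hypothetical stand-in for a factory that opens a fresh stream on each call. The names ReuseStyles, readTwiceByRewinding and readTwiceFromFactory are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ReuseStyles {
    // Hypothetical stand-in for a stream factory: each call opens a new stream.
    interface StreamFactory {
        InputStream open() throws IOException;
    }

    // Style 1: a single stream that is rewound between reads.
    // Assumes the content fits within markLimit bytes.
    static String[] readTwiceByRewinding(InputStream raw, int markLimit) throws IOException {
        try (BufferedInputStream in = new BufferedInputStream(raw)) {
            in.mark(markLimit); // remember the start of the stream
            String first = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            in.reset();         // rewind to the marked position
            String second = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return new String[] {first, second};
        }
    }

    // Style 2: no rewinding; the factory is asked for a fresh stream per read.
    static String[] readTwiceFromFactory(StreamFactory factory) throws IOException {
        String[] results = new String[2];
        for (int i = 0; i < results.length; i++) {
            try (InputStream in = factory.open()) {
                results[i] = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello tika".getBytes(StandardCharsets.UTF_8);
        String[] rewound = readTwiceByRewinding(new ByteArrayInputStream(data), 1024);
        String[] reopened = readTwiceFromFactory(() -> new ByteArrayInputStream(data));
        System.out.println(rewound[0].equals(rewound[1]));
        System.out.println(reopened[0].equals(reopened[1]));
    }
}
```

The trade-off as I understand it: rewinding works on any one-shot stream but has to keep whatever was already read, while a factory keeps nothing extra but only works when the underlying source can be opened again.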
On Tue, 23 Feb 2021, Peter Kronenberg wrote:
I was re-reading some emails with Nick Burch from around Dec 22-23, and
maybe I misunderstood him, but it sounds like he was saying that
TikaInputStream is smart enough to automatically spool the stream to
disk to allow re-use.
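
For what it's worth, the general idea of spooling to disk can be sketched with plain JDK calls. This is only an illustration of the concept, not how TikaInputStream is actually implemented; SpooledSource and its open() method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Tika: copies a one-shot stream to a
// temporary file so the bytes can be read as many times as needed.
final class SpooledSource implements AutoCloseable {
    private final Path spoolFile;

    SpooledSource(InputStream oneShot) throws IOException {
        spoolFile = Files.createTempFile("spool-", ".tmp");
        // Drain the original stream to disk exactly once.
        Files.copy(oneShot, spoolFile, StandardCopyOption.REPLACE_EXISTING);
    }

    // Each call returns an independent stream positioned at byte 0.
    InputStream open() throws IOException {
        return Files.newInputStream(spoolFile);
    }

    // Removes the temporary file once re-use is finished.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(spoolFile);
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
                new ByteArrayInputStream("hello tika".getBytes(StandardCharsets.UTF_8));
        try (SpooledSource source = new SpooledSource(original)) {
            for (int pass = 1; pass <= 2; pass++) {
                try (InputStream in = source.open()) {
                    String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                    System.out.println("pass " + pass + ": " + text);
                }
            }
        }
    }
}
```

The cost is one full copy of the content to disk up front; after that, every reader gets its own stream and nobody has to reset anything.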
If a parser knows i