On Tue, 23 Feb 2021, Peter Kronenberg wrote:
I was re-reading some emails with Nick Burch from around Dec 22-23, and maybe I misunderstood him, but it sounds like he was saying that TikaInputStream is smart enough to automatically spool the stream to disk to allow re-use.
If a parser knows it is going to need a File, or knows it will need to re-read the stream multiple times, it can tell TikaInputStream, which will then save the stream to a temp file. If you as the caller know this, you can force it with a getFile() / getPath() call.
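
As a rough sketch of the caller-side option (assuming tika-core is on the classpath; the class name SpoolExample and the helper readTwice are made up for illustration), wrapping a plain stream and asking for the Path should force the spool to a temp file, which can then be read as often as needed:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.tika.io.TikaInputStream;

public class SpoolExample {

    // Hypothetical helper: wraps a plain stream, forces it to disk,
    // then reads the spooled temp file twice to show it is re-usable.
    static String readTwice(InputStream raw) throws IOException {
        try (TikaInputStream tis = TikaInputStream.get(raw)) {
            // Asking for the Path makes TikaInputStream save the stream to a temp file.
            Path spooled = tis.getPath();

            String first = new String(Files.readAllBytes(spooled), StandardCharsets.UTF_8);
            String second = new String(Files.readAllBytes(spooled), StandardCharsets.UTF_8);
            return first + "|" + second;
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readTwice(raw));
    }
}
```

As far as I know, the temp file is managed by the TikaInputStream and is cleaned up when it is closed, so do the re-reading inside the try block.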
If spooling to a local file is expensive, but restarting the stream reading is cheap, then an InputStreamFactory can be used instead. Typically that's with cloud storage or the like.
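
A minimal sketch of the factory route, again assuming tika-core on the classpath and with made-up names (FactoryExample, readTwice, drain). Here the factory just hands back an in-memory stream; in practice it would reopen the remote object. Whether a given parser actually consults the factory is up to that parser, so treat this as an illustration of the calls rather than of what any particular parser does:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.io.InputStreamFactory;
import org.apache.tika.io.TikaInputStream;

public class FactoryExample {

    // Reads a stream to the end and returns its content as UTF-8 text.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Hypothetical helper: restarts the stream via the factory instead of spooling to disk.
    static String readTwice(InputStreamFactory factory) throws IOException {
        try (TikaInputStream tis = TikaInputStream.get(factory)) {
            if (!tis.hasInputStreamFactory()) {
                throw new IllegalStateException("no factory attached to this stream");
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 2; i++) {
                // Each call asks the factory for a brand new stream from the start.
                try (InputStream fresh = tis.getInputStreamFactory().getInputStream()) {
                    if (i > 0) {
                        sb.append('|');
                    }
                    sb.append(drain(fresh));
                }
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        // Stand-in for "reopen the cloud object"; cheap to call repeatedly.
        InputStreamFactory factory = () -> new ByteArrayInputStream(data);
        System.out.println(readTwice(factory));
    }
}
```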
Nick
