Tim Allison created TIKA-4536:
---------------------------------
Summary: Discuss: Consider requiring TikaInputStream for parsers
and detectors
Key: TIKA-4536
URL: https://issues.apache.org/jira/browse/TIKA-4536
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison
On 4.x, we changed all embedded file processing to use TikaInputStream. This
made a bunch of code a lot simpler.
We could simplify code dramatically if we required TikaInputStream in parsers
and detectors.
Caching to disk is no longer a bottleneck for many users, and users can choose
to cap how much of a stream is read by wrapping their InputStreams in a
BoundedInputStream or similar.
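As a rough illustration of the wrapping idea, here is a minimal sketch that uses only java.io. SimpleBoundedInputStream and readBounded are hypothetical names made up for this example; this is not Tika's or Commons IO's BoundedInputStream, and the real classes may behave differently at the limit. In practice the bounded stream would then be handed to Tika rather than read directly.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedStreamSketch {

    /** Hypothetical stand-in: reports end-of-stream after max bytes. */
    static final class SimpleBoundedInputStream extends FilterInputStream {
        private final long max;
        private long count;

        SimpleBoundedInputStream(InputStream in, long max) {
            super(in);
            this.max = max;
        }

        @Override
        public int read() throws IOException {
            if (count >= max) {
                return -1; // limit reached: pretend the stream ended
            }
            int b = super.read();
            if (b != -1) {
                count++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            if (count >= max) {
                return -1;
            }
            // Never ask the underlying stream for more than the remaining budget.
            int allowed = (int) Math.min(len, max - count);
            int n = super.read(buf, off, allowed);
            if (n > 0) {
                count += n;
            }
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            long allowed = Math.min(n, max - count);
            if (allowed <= 0) {
                return 0;
            }
            long skipped = super.skip(allowed);
            count += skipped;
            return skipped;
        }

        @Override
        public int available() throws IOException {
            return (int) Math.min(super.available(), max - count);
        }

        // mark/reset would desynchronize the byte count, so this sketch disables them.
        @Override
        public boolean markSupported() {
            return false;
        }

        @Override
        public synchronized void mark(int readLimit) {
            // intentionally a no-op
        }

        @Override
        public synchronized void reset() throws IOException {
            throw new IOException("mark/reset not supported");
        }
    }

    /** Reads at most max bytes from in, then closes it. */
    static byte[] readBounded(InputStream in, long max) throws IOException {
        try (InputStream bounded = new SimpleBoundedInputStream(in, max)) {
            return bounded.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] out = readBounded(new ByteArrayInputStream(new byte[1000]), 100);
        System.out.println(out.length); // prints 100
    }
}
```

Because the bound is applied before the stream reaches the parser, anything that spools the stream to disk downstream would see at most max bytes, which is the point of the suggestion above.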
This would be a major breaking change, but would clean up a lot of code.
WDYT?