We recently had a request [0] to change our default behavior so that we
process multiple/concatenated compressor streams for gzip, bzip2, etc.
When we made this change and compared the new results against our previous
ones, we lost quite a few attachments, both because of the "garbage after a
valid x" exception and because of how we're buffering/digesting the stream.
Is there any way to turn on extraction of concatenated compressor streams,
but have it silently stop reading instead of throwing a garbage exception?
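
To make the question concrete, here is a rough sketch of the behavior we're
after, written as a plain java.io wrapper. LenientInputStream is a
hypothetical name, not an existing Commons Compress or Tika class. It assumes
the "garbage" condition surfaces as an IOException from read(), and it
bluntly treats any IOException after at least one successful read as end of
stream, so it would also hide truncation and genuine I/O errors.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical wrapper: once some data has been read successfully, a later
 * IOException from the wrapped stream is reported as end-of-stream (-1)
 * instead of being rethrown. An IOException before any data is still thrown.
 */
class LenientInputStream extends FilterInputStream {
    private boolean sawData = false; // at least one byte was delivered
    private boolean done = false;    // EOF reached or an error was swallowed

    LenientInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (done) {
            return -1;
        }
        try {
            int b = in.read();
            if (b == -1) {
                done = true;
            } else {
                sawData = true;
            }
            return b;
        } catch (IOException e) {
            return handle(e);
        }
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (done) {
            return -1;
        }
        try {
            int n = in.read(buf, off, len);
            if (n == -1) {
                done = true;
            } else if (n > 0) {
                sawData = true;
            }
            return n;
        } catch (IOException e) {
            return handle(e);
        }
    }

    // Rethrow if nothing was read yet; otherwise stop quietly.
    private int handle(IOException e) throws IOException {
        if (!sawData) {
            throw e;
        }
        done = true;
        return -1;
    }
}
```

On our side we would presumably wrap the compressor stream with it, e.g.
new LenientInputStream(new GzipCompressorInputStream(raw, true)), but an
option inside the compressor streams themselves that tells trailing garbage
apart from other failures would be much cleaner.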
Thank you!
Best,
Tim
[0] https://issues.apache.org/jira/browse/TIKA-4048