Hm, I see where it is throwing the exception. Would you create a Jira ticket for this feature request and attach at least one example gz file and a failing JUnit test?
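For the example file, a minimal sketch of how such input might be generated with only the JDK: two complete gzip members followed by bytes that are not a gzip header. The class name, method names, and sample strings are hypothetical and not taken from Tika or Commons Compress.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical fixture builder: concatenated gzip members plus trailing non-gzip bytes. */
class ConcatenatedGzFixture {

    /** Compresses the text as one complete gzip member (header, deflate data, trailer). */
    static byte[] gzipMember(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Writes both members back to back, then appends the trailing bytes unchanged. */
    static byte[] build(String first, String second, byte[] trailing) {
        byte[] a = gzipMember(first);
        byte[] b = gzipMember(second);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(a, 0, a.length);
        out.write(b, 0, b.length);
        out.write(trailing, 0, trailing.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] trailing = "not gzip".getBytes(StandardCharsets.UTF_8);
        byte[] data = build("hello ", "world", trailing);
        System.out.println("fixture size in bytes: " + data.length);
    }
}
```

A failing test could then pass these bytes to `new GzipCompressorInputStream(in, true)` and read to the end. Going by the stack trace quoted below, that is where I would expect "Garbage after a valid .gz stream" to surface, but I have not run it against this exact input.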
TY,
Gary

On Tue, Aug 15, 2023, 12:31 PM Tim Allison <talli...@apache.org> wrote:

> Gary,
>
> I'm sorry for my delay. I'm just back to the keyboard from some time away.
>
> This is an example from the gz stream. We had similar messages from some
> bzip2 and xz.
>
> Caused by: java.io.IOException: Garbage after a valid .gz stream
>     at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.init(GzipCompressorInputStream.java:240)
>     at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:391)
>     at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)
>     at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
>     at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:292)
>     at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
>     at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:205)
>
> Thank you!
>
> On 2023/07/29 14:49:23 Gary Gregory wrote:
> > Hi Tim,
> >
> > Do you have a stack trace? Maybe this is an option we can add...
> >
> > Gary
> >
> > On Wed, Jul 26, 2023, 3:22 PM Tim Allison <talli...@apache.org> wrote:
> >
> > > We recently had a request to change our default behavior to turn on
> > > processing multiple/concatenated compressor streams for gzip, bzip2, etc.
> > > When we made this change and compared the updated results with our previous
> > > results, we lost quite a few attachments because of the "garbage after a
> > > valid x" exception and because of how we're buffering/digesting the stream.
> > >
> > > Is there any way to turn on extraction of concatenated compressor streams,
> > > but have it silently stop reading instead of throwing a garbage exception?
> > >
> > > Thank you!
> > >
> > > Best,
> > >
> > > Tim
> > >
> > > [0] https://issues.apache.org/jira/browse/TIKA-4048