[ 
https://issues.apache.org/jira/browse/COMPRESS-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820416#comment-17820416
 ] 

Gary D. Gregory edited comment on COMPRESS-666 at 2/24/24 8:31 PM:
-------------------------------------------------------------------

Hello [~cosmin79] 

Thank you for your report.

The example you provided does not compile. Note that you must use Java 8. If 
you can provide something we can run on Java 8, we can debug it.

Your best bet is to provide a PR on GitHub.



> Commons compress 1.26.0 gives unexpected Corrupted TAR archive
> --------------------------------------------------------------
>
>                 Key: COMPRESS-666
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-666
>             Project: Commons Compress
>          Issue Type: Bug
>         Environment: Commons compress 1.26.0 to get a failure. Any tar tgz.
>            Reporter: Cosmin Carabet
>            Priority: Major
>
> Something in 
> [https://github.com/apache/commons-compress/compare/rel/commons-compress-1.25.0...master]
>  seems to make iterating through the tar entries of multiple 
> TarArchiveInputStreams in parallel fail with "Corrupted TAR archive":
>  
> {code:java}
> @Test
> void bla() {
>     ExecutorService executorService = Executors.newFixedThreadPool(10);
>     List<CompletableFuture<Void>> tasks = IntStream.range(0, 200)
>             .mapToObj(_idx -> CompletableFuture.runAsync(
>                     () -> {
>                         try (InputStream inputStream = this.getClass()
>                                         .getResourceAsStream(
>                                                 "/<your favourite tar tgz>");
>                                 TarArchiveInputStream tarInputStream =
>                                         new TarArchiveInputStream(new GZIPInputStream(inputStream))) {
>                             TarArchiveEntry tarEntry;
>                             while ((tarEntry = tarInputStream.getNextTarEntry()) != null) {
>                                 System.out.println("Reading entry %s with size %d"
>                                         .formatted(tarEntry.getName(), tarEntry.getSize()));
>                             }
>                         } catch (Exception ex) {
>                             throw new SafeRuntimeException(ex);
>                         }
>                     },
>                     executorService))
>             .toList();
>     
>     Futures.getUnchecked(CompletableFuture.allOf(verificationTasks.toArray(new CompletableFuture<?>[0])));
> } {code}
> Although TarArchiveInputStream is marked as not thread safe, I am not reusing 
> objects here. Those are in fact separate objects, presumably all with their 
> own position tracking info.
>  
> The stacktrace here looks like:
> {code:java}
> Caused by: java.io.IOException: Corrupted TAR archive.
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1480)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:534)
>     at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:431)
>     at
> Caused by: java.lang.IllegalArgumentException: Invalid byte 100 at offset 0 in 'dddddddddddd' len=12
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:516)
>     at org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:540)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeaderUnwrapped(TarArchiveEntry.java:1496)
>     at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1478)
>     ... 7 more
>  {code}
> That code shows that occasionally the header is wrong (the tar entry name 
> contains gibberish bits) which makes me think that `getNextTarEntry()` can be 
> faulty.
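
For readers unfamiliar with the "Invalid byte 100" message in the stack trace: numeric fields in a tar header are stored as ASCII octal digits, and byte 100 is the ASCII code for 'd'. A 12-byte field (the width of the size and mtime fields in the ustar layout) filled with 'd' characters therefore cannot be parsed as a number, which suggests that entry data was read where a header was expected. The following is a minimal sketch of that check. `OctalFieldSketch` and its `parseOctal` are hypothetical names for illustration and are a simplification, not the Commons Compress implementation.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: a simplified octal-field parser, not the Commons Compress code. */
class OctalFieldSketch {

    /** Parses an ASCII-octal header field, ignoring leading spaces and trailing NUL/space padding. */
    static long parseOctal(byte[] buffer, int offset, int length) {
        int end = offset + length;
        // Trim the trailing NULs and spaces that terminate or pad the field.
        while (end > offset && (buffer[end - 1] == 0 || buffer[end - 1] == ' ')) {
            end--;
        }
        int start = offset;
        // Skip leading spaces.
        while (start < end && buffer[start] == ' ') {
            start++;
        }
        long result = 0;
        for (int i = start; i < end; i++) {
            byte b = buffer[i];
            if (b < '0' || b > '7') {
                // Anything outside '0'..'7' means the bytes are not a numeric header field.
                throw new IllegalArgumentException(
                        "Invalid byte " + b + " at offset " + (i - offset));
            }
            result = (result << 3) + (b - '0');
        }
        return result;
    }

    public static void main(String[] args) {
        // A well-formed 12-byte field: eleven octal digits plus a NUL terminator.
        byte[] valid = "00000002000\0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parseOctal(valid, 0, valid.length)); // octal 2000 = 1024

        // The bytes from the reported stack trace: 'd' is byte 100.
        byte[] garbage = "dddddddddddd".getBytes(StandardCharsets.US_ASCII);
        try {
            parseOctal(garbage, 0, garbage.length);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```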
>  
> Running that code with commons compress 1.25.0 works as expected. So it's 
> probably something added since November. Note that this is something related 
> to parallelism - using an executor service with a single thread doesn't 
> suffer from the same error. The tgz to decompress doesn't really matter - you 
> can use a manually created one worth a few KBs.
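
Since the reproducer is asked to run on Java 8, here is a sketch of a concurrency harness that avoids `String.formatted`, `Stream.toList` and third-party helpers. It uses only the JDK, so the per-task work is a plain `GZIPInputStream` read over an in-memory archive. The class name `ConcurrentReadSketch` and its methods are hypothetical, and this sketch does not exercise Commons Compress by itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical Java 8 harness: runs the same read task many times on a fixed thread pool. */
class ConcurrentReadSketch {

    /** Submits the task `runs` times to `threads` threads and returns the cause of every failed run. */
    static <T> List<Throwable> runConcurrently(int threads, int runs, Callable<T> task)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                futures.add(pool.submit(task));
            }
            List<Throwable> failures = new ArrayList<>();
            for (Future<T> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException ex) {
                    failures.add(ex.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    /** Builds a small gzip payload in memory so the sketch needs no resource file. */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    /** Each call opens its own streams over the shared bytes; no stream object is shared. */
    static long countBytes(byte[] gzipped) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[512];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] archive = gzip(new byte[10 * 1024]);
        // Same shape as the report: 200 runs on 10 threads.
        List<Throwable> failures = runConcurrently(10, 200, () -> countBytes(archive));
        System.out.println("failed runs: " + failures.size());
    }
}
```

To target the reported behaviour, the body of `countBytes` would be replaced by the loop from the description: wrap the `GZIPInputStream` in a `TarArchiveInputStream` and call `getNextTarEntry()` until it returns null, with Commons Compress 1.26.0 on the classpath.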



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
