[ 
https://issues.apache.org/jira/browse/COMPRESS-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory resolved COMPRESS-679.
--------------------------------------
    Fix Version/s: 1.26.2
       Resolution: Fixed

[~mikael_mechoulam]

Thank you for your report.

Fixed in git master and snapshot builds in 
https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-compress/1.26.2-SNAPSHOT/

Please test and let us know.


> Regression on parallel processing of 7zip files
> -----------------------------------------------
>
>                 Key: COMPRESS-679
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-679
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.26.0, 1.26.1
>            Reporter: Mikaël MECHOULAM
>            Assignee: Gary D. Gregory
>            Priority: Critical
>             Fix For: 1.26.2
>
>         Attachments: file.7z
>
>
> I've run into a bug that occurs when attempting to read a 7zip file from 
> several threads simultaneously. The following code illustrates the problem; 
> the file.7z is attached.
>  
> {code:java}
> import java.io.InputStream;
> import java.nio.file.Paths;
> import java.util.stream.IntStream;
>
> import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
> import org.apache.commons.compress.archivers.sevenz.SevenZFile;
>
> public class TestZip {
>     public static void main(final String[] args) {
>         final Runnable runnable = () -> {
>             try (final SevenZFile sevenZFile = SevenZFile.builder().setPath(Paths.get("file.7z")).get()) {
>                 SevenZArchiveEntry sevenZArchiveEntry;
>                 while ((sevenZArchiveEntry = sevenZFile.getNextEntry()) != null) {
>                     // The entry must not be the first of the archive to reproduce.
>                     if ("file4.txt".equals(sevenZArchiveEntry.getName())) {
>                         final InputStream inputStream = sevenZFile.getInputStream(sevenZArchiveEntry);
>                         // treatments...
>                         break;
>                     }
>                 }
>             } catch (final Exception e) { // java.io.IOException: Checksum verification failed
>                 e.printStackTrace();
>             }
>         };
>         IntStream.range(0, 30).forEach(i -> new Thread(runnable).start());
>     }
> }
> {code}
> Below is the output I receive on version 1.26: 
>  
> {code:java}
> java.io.IOException: Checksum verification failed
>   at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.verify(ChecksumVerifyingInputStream.java:98)
>   at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:92)
>   at org.apache.commons.io.IOUtils.skip(IOUtils.java:2422)
>   at org.apache.commons.io.IOUtils.skip(IOUtils.java:2380)
>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.getCurrentStream(SevenZFile.java:912)
>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.getInputStream(SevenZFile.java:988)
>   at com.infotel.arcsys.nativ.archiving.zip.TestZip.lambda$main$0(TestZip.java:21)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> {code}
> The issue seems to have been introduced in the transition from version 1.25 
> to 1.26 of Apache Commons Compress. In the {{SevenZFile}} class, the 
> private method {{getCurrentStream}} migrated from the library's own 
> {{IOUtils.skip(InputStream, long)}} to the Commons IO method of the same 
> signature, which changed its behavior: in version 1.26 the skip goes 
> through a shared, unsynchronized scratch buffer intended only for writing 
> ({{SCRATCH_BYTE_BUFFER_WO}}), so concurrent readers corrupt each other's 
> data and checksum verification fails. The problem seems to be resolved by 
> passing a {{Supplier}} that allocates a fresh buffer for each call:
> {code:java}
> try (InputStream stream = deferredBlockStreams.remove(0)) {
>     org.apache.commons.io.IOUtils.skip(stream, Long.MAX_VALUE,
>         () -> new byte[org.apache.commons.io.IOUtils.DEFAULT_BUFFER_SIZE]);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
