[ https://issues.apache.org/jira/browse/COMPRESS-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846750#comment-17846750 ]
Gary D. Gregory edited comment on COMPRESS-679 at 5/15/24 8:32 PM:
-------------------------------------------------------------------

[~mikael_mechoulam]

Thank you for your report.

Fixed in git master and snapshot builds in https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-compress/1.26.2-SNAPSHOT/

Please test and let us know.


was (Author: garydgregory):
    [~mikael_mechoulam]

Thank you for your report.

Fixed in git master and snapshot builds in https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-compress/1.26.2-SNAPSHOT/

Please give test and let us know.

> Regression on parallel processing of 7zip files
> -----------------------------------------------
>
>                 Key: COMPRESS-679
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-679
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.26.0, 1.26.1
>            Reporter: Mikaël MECHOULAM
>            Assignee: Gary D. Gregory
>            Priority: Critical
>             Fix For: 1.26.2
>
>         Attachments: file.7z
>
>
> I've run into a bug which occurs when attempting to read a 7zip file in
> several threads simultaneously. The following code illustrates the problem.
> The file.7z used here is attached.
>
> {code:java}
> import java.io.InputStream;
> import java.nio.file.Paths;
> import java.util.stream.IntStream;
> import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
> import org.apache.commons.compress.archivers.sevenz.SevenZFile;
>
> public class TestZip {
>     public static void main(final String[] args) {
>         // Each thread opens its own SevenZFile on the same archive.
>         final Runnable runnable = () -> {
>             try {
>                 try (final SevenZFile sevenZFile = SevenZFile.builder().setPath(Paths.get("file.7z")).get()) {
>                     SevenZArchiveEntry sevenZArchiveEntry;
>                     while ((sevenZArchiveEntry = sevenZFile.getNextEntry()) != null) {
>                         // To reproduce, the entry must not be the first one in the archive.
>                         if ("file4.txt".equals(sevenZArchiveEntry.getName())) {
>                             final InputStream inputStream = sevenZFile.getInputStream(sevenZArchiveEntry);
>                             // treatments...
>                             break;
>                         }
>                     }
>                 }
>             } catch (final Exception e) { // java.io.IOException: Checksum verification failed
>                 e.printStackTrace();
>             }
>         };
>         // Start 30 threads that read the archive simultaneously.
>         IntStream.range(0, 30).forEach(i -> new Thread(runnable).start());
>     }
> }
> {code}
> Below is the output I receive on version 1.26:
>
> {code:java}
> java.io.IOException: Checksum verification failed
>     at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.verify(ChecksumVerifyingInputStream.java:98)
>     at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:92)
>     at org.apache.commons.io.IOUtils.skip(IOUtils.java:2422)
>     at org.apache.commons.io.IOUtils.skip(IOUtils.java:2380)
>     at org.apache.commons.compress.archivers.sevenz.SevenZFile.getCurrentStream(SevenZFile.java:912)
>     at org.apache.commons.compress.archivers.sevenz.SevenZFile.getInputStream(SevenZFile.java:988)
>     at com.infotel.arcsys.nativ.archiving.zip.TestZip.lambda$main$0(TestZip.java:21)
>     at java.base/java.lang.Thread.run(Thread.java:833)
> {code}
> The issue seems to arise from the transition from version 1.25 to 1.26 of
> Apache Commons Compress.
> In the {{SevenZFile}} class of the library, the private method
> {{getCurrentStream}} has switched from {{IOUtils.skip(InputStream, long)}}
> to the method with the same signature in the Commons IO package, which
> changes its behavior. In version 1.26, it uses a shared, unsynchronized
> buffer that is theoretically intended only for writing
> ({{SCRATCH_BYTE_BUFFER_WO}}). When several threads skip at the same time,
> this causes checksum verification failures within the library. The problem
> seems to be resolved by specifying the {{Supplier}} of the buffer to use.
> {code:java}
> try (InputStream stream = deferredBlockStreams.remove(0)) {
>     org.apache.commons.io.IOUtils.skip(stream, Long.MAX_VALUE,
>         () -> new byte[org.apache.commons.io.IOUtils.DEFAULT_BUFFER_SIZE]);
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
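
For readers without the attachment, the idea behind the proposed fix can be sketched with the JDK alone. This is only an illustration under stated assumptions, not the Commons Compress or Commons IO code: the {{skip}} helper below is a hypothetical stand-in that mirrors the three-argument call shape quoted in the report, {{java.util.zip.CheckedInputStream}} stands in for the library's checksum-verifying stream, and the class name and the 8192-byte buffer size are made up for the example. Because every call obtains its own scratch buffer from the supplier, no thread can overwrite bytes that another thread is about to feed into its checksum.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class SkipBufferSketch {

    // Hypothetical helper (not the Commons IO implementation): skips up to
    // toSkip bytes by reading them into a buffer obtained from the supplier.
    public static long skip(final InputStream in, final long toSkip,
            final Supplier<byte[]> bufferSupplier) throws IOException {
        final byte[] buffer = bufferSupplier.get();
        long remaining = toSkip;
        while (remaining > 0) {
            final int n = in.read(buffer, 0, (int) Math.min(remaining, buffer.length));
            if (n < 0) {
                break; // end of stream reached before toSkip bytes were read
            }
            remaining -= n;
        }
        return toSkip - remaining;
    }

    // Reads the whole array through a checksum-computing stream, using a
    // buffer allocated for this call only, and returns the resulting CRC-32.
    public static long crcAfterSkippingAll(final byte[] data) throws IOException {
        try (CheckedInputStream in =
                new CheckedInputStream(new ByteArrayInputStream(data), new CRC32())) {
            skip(in, Long.MAX_VALUE, () -> new byte[8192]);
            return in.getChecksum().getValue();
        }
    }

    // Runs the computation in several threads at once and collects the results.
    public static List<Long> runConcurrently(final byte[] data, final int threads)
            throws InterruptedException {
        final List<Long> results = new CopyOnWriteArrayList<>();
        final List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final Thread t = new Thread(() -> {
                try {
                    results.add(crcAfterSkippingAll(data));
                } catch (final IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers.add(t);
            t.start();
        }
        for (final Thread t : workers) {
            t.join();
        }
        return results;
    }

    public static void main(final String[] args) throws Exception {
        final byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        final CRC32 expected = new CRC32();
        expected.update(data);
        // Every thread should report the same checksum as the direct computation.
        final boolean allMatch = runConcurrently(data, 30).stream()
                .allMatch(v -> v == expected.getValue());
        System.out.println("all checksums match: " + allMatch);
    }
}
```

Allocating a buffer per call costs a little memory for each skip. A per-thread buffer (for example via {{ThreadLocal}}) would be another way to avoid sharing, at the price of keeping the buffer alive for the lifetime of the thread.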