[jira] [Resolved] (COMPRESS-564) Support ZSTD JNI BufferPool

2023-03-20 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-564.

Fix Version/s: 1.21
 Assignee: Peter Lee
   Resolution: Fixed

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Assignee: Peter Lee
>Priority: Major
> Fix For: 1.21
>
>
> commons-compress should allow configuration of zstd-jni's use of
> RecyclingBufferPool vs NoPool. zstd-jni defaults to not pooling buffers:
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]
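
A minimal sketch of the requested configuration, assuming the constructor added
for this issue in 1.21 accepts zstd-jni's BufferPool (check the release javadoc
for the exact signature):
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import com.github.luben.zstd.RecyclingBufferPool;
import org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream;

static void readWithRecyclingPool(final Path file) throws IOException {
    try (InputStream fin = Files.newInputStream(file);
         // RecyclingBufferPool reuses zstd-jni's internal buffers across
         // streams; the single-argument constructor keeps the NoPool default.
         ZstdCompressorInputStream zin =
                 new ZstdCompressorInputStream(fin, RecyclingBufferPool.INSTANCE)) {
        // read decompressed bytes from zin as usual
        zin.read();
    }
}
{code}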



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (COMPRESS-571) 7z random access fails on shuffled entry list

2023-03-20 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-571:
---
Fix Version/s: 1.21

> 7z random access fails on shuffled entry list
> -
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
> Fix For: 1.21
>
>
> I noticed a problem with a 7z file and could reproduce the error when the
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error:
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
>     try (final SevenZFile sevenZFile = new SevenZFile(getFile("COMPRESS-256.7z"))) {
>         List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) sevenZFile.getEntries();
>         Collections.shuffle(entries);
>         for (final SevenZArchiveEntry entry : entries) {
>             IOUtils.toByteArray(sevenZFile.getInputStream(entry));
>         }
>     }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (COMPRESS-571) 7z random access fails on shuffled entry list

2023-03-20 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-571.

Resolution: Fixed

> 7z random access fails on shuffled entry list
> -
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
> Fix For: 1.21
>
>
> I noticed a problem with a 7z file and could reproduce the error when the
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error:
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
>     try (final SevenZFile sevenZFile = new SevenZFile(getFile("COMPRESS-256.7z"))) {
>         List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) sevenZFile.getEntries();
>         Collections.shuffle(entries);
>         for (final SevenZArchiveEntry entry : entries) {
>             IOUtils.toByteArray(sevenZFile.getInputStream(entry));
>         }
>     }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (COMPRESS-604) 1.21 with java 8 compatibility issue error: cannot access SeekableByteChannel ZipArchiveOutputStream zipOutputStream = new ZipArchiveOutputStream(response.getOutputS

2022-03-10 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504675#comment-17504675
 ] 

Peter Lee commented on COMPRESS-604:


Hi [~anupamamanish.s].

Could you check whether the suggested fix works?

>  1.21 with java 8 compatibility issue error: cannot access 
> SeekableByteChannel ZipArchiveOutputStream zipOutputStream = new 
> ZipArchiveOutputStream(response.getOutputStream()); class file for 
> java.nio.channels.SeekableByteChannel not found
> --
>
> Key: COMPRESS-604
> URL: https://issues.apache.org/jira/browse/COMPRESS-604
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Anupama Shinde
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> error: cannot access SeekableByteChannel
> ZipArchiveOutputStream zipOutputStream = new 
> ZipArchiveOutputStream(response.getOutputStream());
> class file for java.nio.channels.SeekableByteChannel not found
>  
>  
> The original issue was
> [https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2021-36090]
> We are trying to upgrade to commons-compress 1.21, but it seems it is not
> compatible with Java 8.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-608) Corrupt Tar decompression fails to find all available entries

2022-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489475#comment-17489475
 ] 

Peter Lee commented on COMPRESS-608:


I checked the attached pax-bad-hdr-file.tar, and it's related to COMPRESS-575
(commit 61615b31).

That commit hardened the handling of some edge cases that we didn't handle well
enough before. I think the current handling is the better one.

I'm not sure whether this counts as a breaking change (1.20 could read the
available entries but 1.21 cannot). Considering that your attached tarballs are
corrupted tars, IMO it is not a breaking change.
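
For callers who want the 1.20-style behavior of salvaging what they can from a
corrupt archive, a minimal sketch (a hypothetical helper, not part of Compress)
is to stop at the first error and keep the entries read so far:
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;

static List<TarArchiveEntry> readAvailableEntries(final Path tar) throws IOException {
    final List<TarArchiveEntry> entries = new ArrayList<>();
    try (InputStream in = Files.newInputStream(tar);
         TarArchiveInputStream tin = new TarArchiveInputStream(in)) {
        try {
            TarArchiveEntry entry;
            while ((entry = tin.getNextTarEntry()) != null) {
                entries.add(entry); // read or extract the content here
            }
        } catch (final IOException e) {
            // 1.21 fails fast on a corrupt header; log the exception and
            // fall through with the entries that were already read
        }
    }
    return entries;
}
{code}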

> Corrupt Tar decompression fails to find all available entries
> -
>
> Key: COMPRESS-608
> URL: https://issues.apache.org/jira/browse/COMPRESS-608
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Madhu Gopanna
>Priority: Major
> Attachments: pax-bad-hdr-file.tar, sparse-formats.tar, 
> writer-big-long.tar
>
>
> Attached are three Tars for which decompression fails to find all available
> entries (contents) and exits with an IOException. This change in behavior may
> be related to a specific change in 1.21 compared to 1.20's behavior for the
> same Tars.
>  # Tar:  
> /gcc-8-8.3.0/gcc-8.3.0.tar.xz!/gcc-8.3.0/libgo/go/archive/tar/testdata/pax-bad-hdr-file.tar
>  ## Missing:
>  ### PAX1
>  ### PAX1/PAX1
>  ### /PAX1/PAX1/long-path-name
>  ## Not missing: N/A
>  ## Exception: 
> {code:java}
> java.io.IOException: Failed to read Paxheader.Value should end with a newline
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parsePaxHeaders(TarUtils.java:769)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.paxHeaders(TarArchiveInputStream.java:605)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:421)
>     at com.synopsys.sigcompcore.Application.main(Application.java:34) {code}
>  # Tar: 
> /gcc-8-8.3.0/gcc-8.3.0.tar.xz!/gcc-8.3.0/libgo/go/archive/tar/testdata/sparse-formats.tar
>  ## Missing:
>  ### end
>  ### sparse-posix-0.0
>  ### sparse-posix-0.1
>  ### sparse-posix-1.0
>  ## Not missing:
>  ### sparse-gnu
>  ## Exception: 
> {code:java}
> java.io.IOException: Truncated TAR archive
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:743)
>     at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:197)
>     at org.apache.commons.compress.utils.IOUtils.skip(IOUtils.java:129)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:364)
>     at com.synopsys.sigcompcore.Application.main(Application.java:35){code}
>  # Tar: 
> /gcc-6-6.1.1/gcc-6.1.0-dfsg.tar!/gcc-6.1.0/libgo/go/archive/tar/testdata/writer-big-long.tar
>  ## Missing: 16gig.txt
>  ## Not missing: N/A
>  ## Exception: 
> {code:java}
> java.io.IOException: Corrupted TAR archive, sparse entry is invalid
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.readSparseStructs(TarUtils.java:350)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeaderUnwrapped(TarArchiveEntry.java:1656)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1595)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:556)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:379)
>     at com.synopsys.sigcompcore.Application.main(Application.java:37)
> Caused by: java.lang.IllegalArgumentException: Invalid byte 97 at offset 0 in 
> 'ame/longname' len=12
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:153)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:183)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseSparse(TarUtils.java:324)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.readSparseStructs(TarUtils.java:339)
>     ... 5 more{code}
> Expected behavior: Log the IOException and exit (1.21 behavior) from a
> corrupted or truncated Tar only after decompressing all available Tar entries
> (1.20 behavior).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-608) Corrupt Tar decompression fails to find all available entries

2022-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489360#comment-17489360
 ] 

Peter Lee commented on COMPRESS-608:


Hi [~gopanna]

Let me make this clear: these 3 attached tars are corrupted or truncated, and
you are expecting Compress to extract all the available entries - which is what
Compress 1.20 could do. Am I right about this?

> Corrupt Tar decompression fails to find all available entries
> -
>
> Key: COMPRESS-608
> URL: https://issues.apache.org/jira/browse/COMPRESS-608
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Madhu Gopanna
>Priority: Major
> Attachments: pax-bad-hdr-file.tar, sparse-formats.tar, 
> writer-big-long.tar
>
>
> Attached are three Tars for which decompression fails to find all available
> entries (contents) and exits with an IOException. This change in behavior may
> be related to a specific change in 1.21 compared to 1.20's behavior for the
> same Tars.
>  # Tar:  
> /gcc-8-8.3.0/gcc-8.3.0.tar.xz!/gcc-8.3.0/libgo/go/archive/tar/testdata/pax-bad-hdr-file.tar
>  ## Missing:
>  ### PAX1
>  ### PAX1/PAX1
>  ### /PAX1/PAX1/long-path-name
>  ## Not missing: N/A
>  ## Exception: 
> {code:java}
> java.io.IOException: Failed to read Paxheader.Value should end with a newline
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parsePaxHeaders(TarUtils.java:769)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.paxHeaders(TarArchiveInputStream.java:605)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:421)
>     at com.synopsys.sigcompcore.Application.main(Application.java:34) {code}
>  # Tar: 
> /gcc-8-8.3.0/gcc-8.3.0.tar.xz!/gcc-8.3.0/libgo/go/archive/tar/testdata/sparse-formats.tar
>  ## Missing:
>  ### end
>  ### sparse-posix-0.0
>  ### sparse-posix-0.1
>  ### sparse-posix-1.0
>  ## Not missing:
>  ### sparse-gnu
>  ## Exception: 
> {code:java}
> java.io.IOException: Truncated TAR archive
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:743)
>     at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:197)
>     at org.apache.commons.compress.utils.IOUtils.skip(IOUtils.java:129)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:364)
>     at com.synopsys.sigcompcore.Application.main(Application.java:35){code}
>  # Tar: 
> /gcc-6-6.1.1/gcc-6.1.0-dfsg.tar!/gcc-6.1.0/libgo/go/archive/tar/testdata/writer-big-long.tar
>  ## Missing: 16gig.txt
>  ## Not missing: N/A
>  ## Exception: 
> {code:java}
> java.io.IOException: Corrupted TAR archive, sparse entry is invalid
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.readSparseStructs(TarUtils.java:350)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeaderUnwrapped(TarArchiveEntry.java:1656)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1595)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:556)
>     at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:379)
>     at com.synopsys.sigcompcore.Application.main(Application.java:37)
> Caused by: java.lang.IllegalArgumentException: Invalid byte 97 at offset 0 in 
> 'ame/longname' len=12
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:153)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:183)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.parseSparse(TarUtils.java:324)
>     at 
> org.apache.commons.compress.archivers.tar.TarUtils.readSparseStructs(TarUtils.java:339)
>     ... 5 more{code}
> Expected behavior: Log the IOException and exit (1.21 behavior) from a
> corrupted or truncated Tar only after decompressing all available Tar entries
> (1.20 behavior).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-08 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489313#comment-17489313
 ] 

Peter Lee commented on COMPRESS-603:


Fixed with commit 510be7f.

[~mattsicker] Please have a look and check whether it works. :)

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
> Attachments: test.tar.gz
>
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-08 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489299#comment-17489299
 ] 

Peter Lee commented on COMPRESS-603:


Thanks [~mattsicker] 

I can reproduce this like this:
{code:java}
@Test
public void testCompress603Tar() throws IOException, ArchiveException {
    setupTarForCompress603();

    try (TarFile f = new TarFile(archive)) {
        new Expander().expand(f, resultDir);
    }
}

private void setupTarForCompress603() throws IOException, ArchiveException {
    archive = new File(dir, "test.tar");
    final File dummy = new File(dir, "x");
    try (OutputStream o = Files.newOutputStream(dummy.toPath())) {
        o.write(new byte[14]);
    }
    try (ArchiveOutputStream aos = ArchiveStreamFactory.DEFAULT
            .createArchiveOutputStream("tar", Files.newOutputStream(archive.toPath()))) {
        aos.putArchiveEntry(aos.createArchiveEntry(dir, "./"));
        aos.closeArchiveEntry();
        aos.putArchiveEntry(aos.createArchiveEntry(dir, "./a"));
        aos.closeArchiveEntry();
        aos.finish();
    }
} {code}
 

IMO creating a tar containing an entry named "./" is an edge case, but I have
to admit it is a legal case - and GNU tar & bsdtar handle it successfully.

Will fix this soon.

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
> Attachments: test.tar.gz
>
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-07 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488621#comment-17488621
 ] 

Peter Lee edited comment on COMPRESS-603 at 2/8/22, 6:53 AM:
-

I created foo.tar with GNU tar, and the expander works well.

I will try to reproduce this issue with BSD tar.

 

Update:

I'm using bsdtar 3.2.2 to create a tar like this:
{code:java}
bsdtar -cf foo.tar ./test.txt {code}
Still cannot reproduce this issue.


was (Author: peterlee):
I created foo.tar with GNU tar, and the expander works well.

I will try to reproduce this issue with BSD tar.

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
> Attachments: test.tar.gz
>
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-07 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488621#comment-17488621
 ] 

Peter Lee commented on COMPRESS-603:


I created foo.tar with GNU tar, and the expander works well.

I will try to reproduce this issue with BSD tar.

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
> Attachments: test.tar.gz
>
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-07 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488565#comment-17488565
 ] 

Peter Lee commented on COMPRESS-603:


So this is a tar.gz archive, instead of a tar?

 

I tested with code at tag _rel/1.21_, and it works well. My test code is:
{code:java}
@Test
public void testCompress603() throws IOException, ArchiveException {
    try (final InputStream is = new FileInputStream(getFile("test.tar.gz"))) {
        final InputStream gin = new GZIPInputStream(is);
        final TarArchiveInputStream tin = new TarArchiveInputStream(gin);
        new Expander().expand(tin, resultDir);
    }
} {code}

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
> Attachments: test.tar.gz
>
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-603) Expander does not support archives with archive entries beginning with ./

2022-02-06 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487914#comment-17487914
 ] 

Peter Lee commented on COMPRESS-603:


I tested with the latest code but I cannot reproduce this.

My test code:
{code:java}
@Test
public void testCompress603() throws IOException, ArchiveException {
    try (TarFile f = new TarFile(getFile("foo.tar"))) {
        new Expander().expand(f, resultDir);
    }
} {code}
Could you also attach the foo.tar so I can reproduce this? :)

> Expander does not support archives with archive entries beginning with ./
> -
>
> Key: COMPRESS-603
> URL: https://issues.apache.org/jira/browse/COMPRESS-603
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matt Sicker
>Priority: Major
>
> Suppose I create a tar file from a directory like so:
>  
> {code:java}
> tar -cf foo.tar ./foo{code}
> When I try to extract the tar entries using the Expander class, it throws a
> java.io.IOException: Expanding ./ would create file outside of ...
> When I create the tar file without the leading ./, Expander doesn't complain.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-607) ZipArchiveInputStream: Large STORED entry leads to OutOfMemory

2022-02-06 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487906#comment-17487906
 ] 

Peter Lee commented on COMPRESS-607:


For archives stored on disk, you can choose between ZipFile and
ZipArchiveInputStream as needed. But ZipArchiveInputStream is designed to work
in memory. IMO such a limit is reasonable - a Java array cannot hold more than
Integer.MAX_VALUE bytes.

Seems we need to document this limitation.
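
For on-disk archives, a sketch of the ZipFile approach mentioned above (ZipFile
reads the central directory and streams each entry instead of buffering it
whole, so entries beyond the array limit are not a problem):
{code:java}
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;

static void streamEntries(final File archive) throws IOException {
    try (ZipFile zipFile = new ZipFile(archive)) {
        final Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
        while (entries.hasMoreElements()) {
            final ZipArchiveEntry entry = entries.nextElement();
            try (InputStream in = zipFile.getInputStream(entry)) {
                // stream the entry's content; nothing is buffered whole
            }
        }
    }
}
{code}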

> ZipArchiveInputStream: Large STORED entry leads to OutOfMemory
> --
>
> Key: COMPRESS-607
> URL: https://issues.apache.org/jira/browse/COMPRESS-607
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Robin Schimpf
>Priority: Major
>
> While extracting a large Zip with only STORED entries a file larger than 4GB 
> triggered an OutOfMemory error without the JVM having exhausted the available 
> memory.
> {code:java}
> Caused by: java.lang.OutOfMemoryError
>         at 
> java.base/java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:125)
>         at 
> java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:119)
>         at 
> java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
>         at 
> java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
>         at 
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.cacheBytesRead(ZipArchiveInputStream.java:1086)
>         at 
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStoredEntry(ZipArchiveInputStream.java:1015)
>         at 
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.readStored(ZipArchiveInputStream.java:588)
>         at 
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:526){code}
> The stream seems to buffer the whole entry in memory until it finds the next
> entry. Since the buffer is stored in a ByteArrayOutputStream, only
> Integer.MAX_VALUE bytes can be buffered.
> Is this limitation intended? I have read
> [https://commons.apache.org/proper/commons-compress/zip.html#ZipArchiveInputStream_vs_ZipFile]
>  but found nothing about a file size limitation.
> Since the file is stored on disk I will switch to ZipFile, but for other cases
> it would be preferable to be able to extract such files.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-605) Failed to parse Non-zip64 signed apk with data descriptor

2022-02-06 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487859#comment-17487859
 ] 

Peter Lee commented on COMPRESS-605:


Hi [~nicktheuncharted]

I'm not familiar with the differences between the APK and ZIP specifications.
Is this caused by a difference between them?

And a PR is always welcome. :)

> Failed to parse Non-zip64 signed apk with data descriptor
> -
>
> Key: COMPRESS-605
> URL: https://issues.apache.org/jira/browse/COMPRESS-605
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: nick allen
>Priority: Major
>
> I can't upload my apk due to my company's security policy, but I did find
> where the problem lies.
> In
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream#readDataDescriptor
>  we check whether the following bytes are signatures to determine whether the
> size field is 8 bytes or 4 bytes. Because what follows is the APK signing
> block, the "size" field is always assumed to take 8 bytes.
> So (4 + 4 = 8) extra bytes are read, which leads to
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream#isApkSigningBlock
>  also returning false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-596) Mistake in Common Archival Logic code example

2021-12-06 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454360#comment-17454360
 ] 

Peter Lee commented on COMPRESS-596:


Hi [~tamas.mucs]

Thank you for your report.

No doubt the "out" is not defined here. IMO it should be "o" instead of "out".

Even though "finish()" is called within "close()", we always explicitly call
"finish()". We do this in our tests, and I want to keep it consistent here.

And I think "o" is reachable here. :)

> Mistake in Common Archival Logic code example
> -
>
> Key: COMPRESS-596
> URL: https://issues.apache.org/jira/browse/COMPRESS-596
> Project: Commons Compress
>  Issue Type: Bug
>Reporter: Tamas Mucs
>Priority: Minor
>  Labels: compress, docuentation
>
> There is an error in the documentation page
> [https://commons.apache.org/proper/commons-compress/examples.html]
> in the section "Common Archival Logic".
> Currently the example code looks like this:
> {code:java}
> Collection<File> filesToArchive = ...
> try (ArchiveOutputStream o = ... create the stream for your format ...) {
>     for (File f : filesToArchive) {
>         // maybe skip directories for formats like AR that don't store directories
>         ArchiveEntry entry = o.createArchiveEntry(f, entryName(f));
>         // potentially add more flags to entry
>         o.putArchiveEntry(entry);
>         if (f.isFile()) {
>             try (InputStream i = Files.newInputStream(f.toPath())) {
>                 IOUtils.copy(i, o);
>             }
>         }
>         o.closeArchiveEntry();
>     }
>     out.finish();
> } {code}
> The variable "out" is not defined anywhere. Supposedly it is the variable "o" 
> of type ArchiveOutputStream. However it might be redundant to call "finish()" 
> since try-with-resources will implicitly call it via the close() method. Also 
> "o" is not reachable in that context.
>  
> Proposed solution: remove "out".



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (COMPRESS-589) 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 1.20 not

2021-11-14 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-589.

Resolution: Not A Problem

> 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 
> 1.20 not
> ---
>
> Key: COMPRESS-589
> URL: https://issues.apache.org/jira/browse/COMPRESS-589
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: chen
>Priority: Major
>
> The bug happens when I use the TarArchiveInputStream to read bytes from the
> current tar archive entry.
> First of all, we ran into this issue on an *{color:#ff}Android device{color}*.
> The trace shows as below:
> {code:java}
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getActuallySkipped(TarArchiveInputStream.java:478)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.skipRecordPadding(TarArchiveInputStream.java:455)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:367)
> {code}
> But when I downgrade to 1.20, the exception does not show up again, so I think
> it is a bug in the new version.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (COMPRESS-595) IOUtils.readRange(ReadableByteChannel input, int len) reads more than len when input.read() returns < len (ie. there is a partial read)

2021-11-05 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17439574#comment-17439574
 ] 

Peter Lee commented on COMPRESS-595:


_This is currently not the case, because there is a call to rewind()_
_which results in a buffer whose remaining() is reset to `len` if_
_`readNow` < `len`._

I'm not quite clear about this part. Here is the current implementation of
readRange:

 
{code:java}
public static byte[] readRange(final ReadableByteChannel input, final int len)
        throws IOException {
    final ByteArrayOutputStream output = new ByteArrayOutputStream();
    final ByteBuffer b = ByteBuffer.allocate(Math.min(len, COPY_BUF_SIZE));
    int read = 0;
    while (read < len) {
        // Make sure we never read more than len bytes
        b.limit(Math.min(len - read, b.capacity()));
        final int readNow = input.read(b);
        if (readNow <= 0) {
            break;
        }
        output.write(b.array(), 0, readNow);
        b.rewind();
        read += readNow;
    }
    return output.toByteArray();
}
{code}
As you can see, there's only one call to _rewind()_ on _ByteBuffer b_. The
_ByteBuffer b_ is a local buffer used to copy the bytes from input to output.
remaining() is never called on b.

Besides that, we recently fixed readRange() for partial reads in COMPRESS-584
(GitHub PR [#214|https://github.com/apache/commons-compress/pull/214/files]).
Maybe you're testing with Compress 1.21?

> IOUtils.readRange(ReadableByteChannel input, int len) reads more than len 
> when input.read() returns < len (ie. there is a partial read)
> ---
>
> Key: COMPRESS-595
> URL: https://issues.apache.org/jira/browse/COMPRESS-595
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: NP
>Priority: Major
> Attachments: IOUtilsTest.kt
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When `input.read(b)` returns `readNow` < `len`, then it means
>  `input.read(b)` will need to be called again with the same
>  buffer, whose `remaining()` is now the old `remaining()` - `readNow`.
> This way the ReadableByteChannel knows how many bytes are to be
>  read in subsequent iterations of the `while (read < len)` loop.
> This is currently not the case, because there is a call to rewind()
>  which results in a buffer whose remaining() is reset to `len` if
>  `readNow` < `len`.
> I suspect the readRange() method has only been used with channels that never
> do partial reads (such as in-memory byte channels backed by an array), and
> hence the problem has not been experienced until now.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-595) IOUtils.readRange(ReadableByteChannel input, int len) reads more than len when input.read() returns < len (ie. there is a partial read)

2021-11-05 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17439572#comment-17439572
 ] 

Peter Lee commented on COMPRESS-595:


Hi [~nervy_protozoan]

Thank you for your report.

I ran your test case and found that the current implementation passes it. I'm
testing with the latest code. Maybe you missed something in your test case?

> IOUtils.readRange(ReadableByteChannel input, int len) reads more than len 
> when input.read() returns < len (ie. there is a partial read)
> ---
>
> Key: COMPRESS-595
> URL: https://issues.apache.org/jira/browse/COMPRESS-595
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: NP
>Priority: Major
> Attachments: IOUtilsTest.kt
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When `input.read(b)` returns `readNow` < `len`, then it means
>  `input.read(b)` will need to be called again with the same
>  buffer, whose `remaining()` is now the old `remaining()` - `readNow`.
> This way the ReadableByteChannel knows how many bytes are to be
>  read in subsequent iterations of the `while (read < len)` loop.
> This is currently not the case, because there is a call to rewind()
>  which results in a buffer whose remaining() is reset to `len` if
>  `readNow` < `len`.
> I suspect the readRange() method has only been used with channels that never
> do partial reads (such as in-memory byte channels backed by an array), and
> hence the problem has not been experienced until now.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-10-29 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-591.

Fix Version/s: 1.22
   Resolution: Fixed

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Fix For: 1.22
>
> Attachments: lzma-with-eos.7z
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {{- return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);}}
>  {{+ final LZMAInputStream lzmaIn = new LZMAInputStream(in, 
> uncompressedLength, propsByte, dictSize);}}
>  {{+ lzmaIn.enableRelaxedEndCondition();}}
>  {{+ return lzmaIn;}}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.
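
For readability, the change proposed above as a single code block (the
surrounding method in LZMADecoder.java is assumed; only the return statement
changes):
{code:java}
// in LZMADecoder's decode method, instead of
// return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
final LZMAInputStream lzmaIn = new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
lzmaIn.enableRelaxedEndCondition();
return lzmaIn;
{code}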



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-592) Checksum verification failed reading 7z archive with more than 65536 entries

2021-10-29 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-592.

Fix Version/s: 1.22
   Resolution: Fixed

> Checksum verification failed reading 7z archive with more than 65536 entries
> 
>
> Key: COMPRESS-592
> URL: https://issues.apache.org/jira/browse/COMPRESS-592
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Compressors
>Affects Versions: 1.21
> Environment: Compress 1.21 and XZ 1.9 on JDK 11; reproduced on both 
> Windows and Ubuntu Linux
>Reporter: Roland Kreuzer
>Priority: Major
> Fix For: 1.22
>
> Attachments: _DOC.7z
>
>
> I have a use-case where I have to decompress Sevenzip archives from an 
> external source which may have a large number of entries.
> I found decompression fails when trying to extract entry 65536 (zero-based 
> index) with a checksum failure.
>  
> I was able to reproduce the issue with a simple 7Zip file containing 70,001
> entries with random MD5-checksum text files (attached).
> The sample archive was created using the 7Zip Windows client and uses
> LZMA2:3m.
>  
> My code is a simple sequential read of all contents of the file like
> {code:java}
> @Test
> void readBigSevenZipFile() throws IOException
> {
>     try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\_DOC.7z")))
>     {
>         SevenZArchiveEntry entry = sevenZFile.getNextEntry();
>         while (entry != null)
>         {
>             if (entry.hasStream())
>             {
>                 byte[] content = new byte[(int) entry.getSize()];
>                 sevenZFile.read(content);
>                 System.out.println(entry.getName());
>             }
>             entry = sevenZFile.getNextEntry();
>         }
>     }
> }
> {code}
> which fails consistently after file65535.txt with
> {code:java}
> java.io.IOException: Checksum verification failed
> at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888)
>  ~[commons-compress-1.21.jar!/:1.21]
> {code}
>  
> It is noticeable that the value is 2 to the 16th power, which could suggest
> an overflow error of some sort.
>  
> While the minimal sample contains only small txt files, I originally found 
> the issue with larger archives containing also Image and PDF files. The 
> archive's contents or size in byte does not seem to have direct influence on 
> the issue, only the number of files contained within.
>  
> I did not find any workaround yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-592) Checksum verification failed reading 7z archive with more than 65536 entries

2021-10-29 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435890#comment-17435890
 ] 

Peter Lee commented on COMPRESS-592:


I pushed a fix. Please have a look. [~rolandkreuzer]

> Checksum verification failed reading 7z archive with more than 65536 entries
> 
>
> Key: COMPRESS-592
> URL: https://issues.apache.org/jira/browse/COMPRESS-592
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Compressors
>Affects Versions: 1.21
> Environment: Compress 1.21 and XZ 1.9 on JDK 11; reproduced on both 
> Windows and Ubuntu Linux
>Reporter: Roland Kreuzer
>Priority: Major
> Attachments: _DOC.7z
>
>
> I have a use-case where I have to decompress Sevenzip archives from an 
> external source which may have a large number of entries.
> I found decompression fails when trying to extract entry 65536 (zero-based 
> index) with a checksum failure.
>  
> I was able to reproduce the issue with a simple 7Zip file containing 70,001
> entries with random MD5-checksum text files (attached).
> The sample archive was created using the 7Zip Windows client and uses
> LZMA2:3m.
>  
> My code is a simple sequential read of all contents of the file like
> {code:java}
> @Test
> void readBigSevenZipFile() throws IOException
> {
>     try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\_DOC.7z")))
>     {
>         SevenZArchiveEntry entry = sevenZFile.getNextEntry();
>         while (entry != null)
>         {
>             if (entry.hasStream())
>             {
>                 byte[] content = new byte[(int) entry.getSize()];
>                 sevenZFile.read(content);
>                 System.out.println(entry.getName());
>             }
>             entry = sevenZFile.getNextEntry();
>         }
>     }
> }
> {code}
> which fails consistently after file65535.txt with
> {code:java}
> java.io.IOException: Checksum verification failed
> at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888)
>  ~[commons-compress-1.21.jar!/:1.21]
> {code}
>  
> It is noticeable that the value is 2 to the 16th power, which could suggest
> an overflow error of some sort.
>  
> While the minimal sample contains only small txt files, I originally found 
> the issue with larger archives containing also Image and PDF files. The 
> archive's contents or size in byte does not seem to have direct influence on 
> the issue, only the number of files contained within.
>  
> I did not find any workaround yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-592) Checksum verification failed reading 7z archive with more than 65536 entries

2021-10-29 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435881#comment-17435881
 ] 

Peter Lee commented on COMPRESS-592:


Hi [~rolandkreuzer]

Thank you for your report! I think I have located the problem and will try to
fix it soon.

> Checksum verification failed reading 7z archive with more than 65536 entries
> 
>
> Key: COMPRESS-592
> URL: https://issues.apache.org/jira/browse/COMPRESS-592
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Compressors
>Affects Versions: 1.21
> Environment: Compress 1.21 and XZ 1.9 on JDK 11; reproduced on both 
> Windows and Ubuntu Linux
>Reporter: Roland Kreuzer
>Priority: Major
> Attachments: _DOC.7z
>
>
> I have a use-case where I have to decompress Sevenzip archives from an 
> external source which may have a large number of entries.
> I found decompression fails when trying to extract entry 65536 (zero-based 
> index) with a checksum failure.
>  
> I was able to reproduce the issue with a simple 7Zip file containing 70,001
> entries with random MD5-checksum text files (attached).
> The sample archive was created using the 7Zip Windows client and uses
> LZMA2:3m.
>  
> My code is a simple sequential read of all contents of the file like
> {code:java}
> @Test
> void readBigSevenZipFile() throws IOException
> {
>     try (SevenZFile sevenZFile = new SevenZFile(new File("E:\\Temp\\_DOC.7z")))
>     {
>         SevenZArchiveEntry entry = sevenZFile.getNextEntry();
>         while (entry != null)
>         {
>             if (entry.hasStream())
>             {
>                 byte[] content = new byte[(int) entry.getSize()];
>                 sevenZFile.read(content);
>                 System.out.println(entry.getName());
>             }
>             entry = sevenZFile.getNextEntry();
>         }
>     }
> }
> {code}
> which fails consistently after file65535.txt with
> {code:java}
> java.io.IOException: Checksum verification failed
> at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1905)
>  ~[commons-compress-1.21.jar!/:1.21]
> at 
> org.apache.commons.compress.archivers.sevenz.SevenZFile.read(SevenZFile.java:1888)
>  ~[commons-compress-1.21.jar!/:1.21]
> {code}
>  
> It is noticeable that the value is 2 to the 16th power, which could suggest
> an overflow error of some sort.
>  
> While the minimal sample contains only small txt files, I originally found 
> the issue with larger archives containing also Image and PDF files. The 
> archive's contents or size in byte does not seem to have direct influence on 
> the issue, only the number of files contained within.
>  
> I did not find any workaround yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-589) 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 1.20 not

2021-10-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434340#comment-17434340
 ] 

Peter Lee commented on COMPRESS-589:


Hi [~wilx]

_Does this mean any instance of {{TarArchiveEntry}} is tied to the originating 
stream and cannot be reused elsewhere?_

 

For TarArchiveInputStream, the entry is acquired via getNextEntry or
getNextTarEntry, and IIRC it is reused by the subsequent reading procedure. So
we can say it is somewhat 'tied' to the TarArchiveInputStream.

Anyway, IMO it is not good practice to reuse a TarArchiveEntry, especially if
something has changed.

> 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 
> 1.20 not
> ---
>
> Key: COMPRESS-589
> URL: https://issues.apache.org/jira/browse/COMPRESS-589
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: chen
>Priority: Major
>
> The bug happens when I use the TarArchiveInputStream to read bytes from the
> current tar archive entry.
> First of all, we ran into this issue on an *{color:#ff}Android device{color}*.
> The trace shows as below:
> {code:java}
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getActuallySkipped(TarArchiveInputStream.java:478)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.skipRecordPadding(TarArchiveInputStream.java:455)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:367)
> {code}
> But when I downgrade to 1.20, the exception does not show up again, so I think
> it is a bug in the new version.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-589) 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 1.20 not

2021-10-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434333#comment-17434333
 ] 

Peter Lee commented on COMPRESS-589:


Hi [~chenNotFound]

_On Android devices, if the tar package size is greater than 2G, 
inputStream.available() in the skipRecordPadding() method will return 0_
_On Windows, it will return 2147483647 (Integer.MAX_VALUE)_

 

I think this might be the cause of your problem. And it's not hard to test:
just change the input stream (maybe it's a FileInputStream?) you pass to
TarArchiveInputStream to another input stream (maybe a ByteArrayInputStream or
something else) to bypass the difference in available().
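
A minimal sketch of that check (only viable when the test archive fits in
memory, since a Java byte array is limited to Integer.MAX_VALUE bytes):
{code:java}
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;

// ByteArrayInputStream.available() always reports the remaining byte count,
// unlike FileInputStream.available(), whose behavior differs per platform.
byte[] bytes = Files.readAllBytes(Paths.get("archive.tar"));
try (TarArchiveInputStream tin = new TarArchiveInputStream(new ByteArrayInputStream(bytes))) {
    TarArchiveEntry entry;
    while ((entry = tin.getNextTarEntry()) != null) {
        // if this no longer throws "Truncated TAR archive",
        // available() was indeed the difference
    }
}
{code}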

> 1.21 throws a 'java.io.IOException: Truncated TAR archive' exception while 
> 1.20 not
> ---
>
> Key: COMPRESS-589
> URL: https://issues.apache.org/jira/browse/COMPRESS-589
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: chen
>Priority: Major
>
> The bug happens when I use the TarArchiveInputStream to read bytes from the
> current tar archive entry.
> First of all, we ran into this issue on an *{color:#ff}Android device{color}*.
> The trace shows as below:
> {code:java}
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: java.io.IOException: Truncated 
> TAR archive
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getActuallySkipped(TarArchiveInputStream.java:478)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.skipRecordPadding(TarArchiveInputStream.java:455)
> 08-27 14:39:18.657 10633 10963 W System.err: at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:367)
> {code}
> But when I downgrade to 1.20, the exception does not show up again, so I think
> it is a bug in the new version.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-09-25 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419545#comment-17419545
 ] 

Peter Lee edited comment on COMPRESS-591 at 9/26/21, 1:33 AM:
--

_The PR changes also compressors/lzma/LZMACompressorInputStream.java. This is 
wrong and will cause problems._

I see. I will fix this.

 

Thank you again for your report and patch.


was (Author: peterlee):
_The PR changes also compressors/lzma/LZMACompressorInputStream.java. This is 
wrong and will cause problems._

I see. I will fix this.

 

Thank you again for your reporting and path.

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Attachments: lzma-with-eos.7z
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {{- return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);}}
>  {{+ final LZMAInputStream lzmaIn = new LZMAInputStream(in, 
> uncompressedLength, propsByte, dictSize);}}
>  {{+ lzmaIn.enableRelaxedEndCondition();}}
>  {{+ return lzmaIn;}}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-09-23 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419545#comment-17419545
 ] 

Peter Lee commented on COMPRESS-591:


_The PR changes also compressors/lzma/LZMACompressorInputStream.java. This is 
wrong and will cause problems._

I see. I will fix this.

 

Thank you again for your report and patch.

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Attachments: lzma-with-eos.7z
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {{- return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);}}
>  {{+ final LZMAInputStream lzmaIn = new LZMAInputStream(in, 
> uncompressedLength, propsByte, dictSize);}}
>  {{+ lzmaIn.enableRelaxedEndCondition();}}
>  {{+ return lzmaIn;}}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-09-22 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418470#comment-17418470
 ] 

Peter Lee commented on COMPRESS-591:


Hi [~larhzu], please have a look at 
[this|https://github.com/apache/commons-compress/pull/223]

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Attachments: lzma-with-eos.7z
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {code:java}
> - return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + final LZMAInputStream lzmaIn = new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + lzmaIn.enableRelaxedEndCondition();
> + return lzmaIn;
> {code}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-09-21 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418447#comment-17418447
 ] 

Peter Lee commented on COMPRESS-591:


Sorry about my late reply, and thank you for your detailed explanation.

I will create a PR on GitHub for this. Please have a look.

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Attachments: lzma-with-eos.7z
>
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {code:java}
> - return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + final LZMAInputStream lzmaIn = new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + lzmaIn.enableRelaxedEndCondition();
> + return lzmaIn;
> {code}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-585) ZipFile fails to read a zipfile with a comment or extra data longer than 8024 bytes

2021-09-13 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-585.

Resolution: Fixed

> ZipFile fails to read a zipfile with a comment or extra data longer than 8024 
> bytes
> ---
>
> Key: COMPRESS-585
> URL: https://issues.apache.org/jira/browse/COMPRESS-585
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matthijs Laan
>Priority: Minor
> Attachments: COMPRESS-585-test.patch
>
>
> See the attached patch for a unit test demonstrating the issue with a long 
> comment.
> The cause is that {{ZipFile.readCentralDirectoryEntry()}} calls 
> {{IOUtils.readRange()}} and assumes if it returns less than the length it 
> asked for that the EOF is reached, however this is not how that method works: 
> it returns max COPY_BUF_SIZE bytes (8024), even if EOF has not been reached.
> Besides comments and extra data (in the central directory or local file 
> header) longer than 8024 bytes the only other place {{readRange()}} is called 
> is reading filenames, but that seems like a remote edge case and an EOF 
> exception is fine.
> The IOUtils.readRange() JavaDoc does not specify how it communicates EOF. 
> With a blocking channel this would be when it returns a zero length array. It 
> could throw an exception when {{Channel.read()}} returns 0 bytes, because 
> that only happens on non-blocking channels.
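
A sketch of the EOF contract suggested in the last paragraph (a hypothetical 
helper, not the actual IOUtils code; it assumes a blocking channel, where 
read() only returns 0 when the buffer is full):

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.Arrays;

final class ReadRangeSketch {
    // Loops across short reads instead of treating them as EOF; only a
    // negative read() result ends the loop. A zero-length result then
    // unambiguously signals EOF.
    static byte[] readRange(final ReadableByteChannel channel, final int length)
            throws IOException {
        final ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining() && channel.read(buf) >= 0) {
            // keep reading; a short read alone is not EOF
        }
        buf.flip();
        return Arrays.copyOf(buf.array(), buf.limit());
    }
}
{code}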



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-584) IOUtils.readRange() can read more from a channel than asked for

2021-09-13 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-584.

Resolution: Fixed

> IOUtils.readRange() can read more from a channel than asked for
> ---
>
> Key: COMPRESS-584
> URL: https://issues.apache.org/jira/browse/COMPRESS-584
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matthijs Laan
>Assignee: Peter Lee
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When {{IOUtils.readRange(ReadableByteChannel,int)}} gets less than the number 
> of bytes asked for in the first read call it does not reduce the buffer size 
> for the next read call and may read more than asked for.
> This situation is rare when using a {{FileChannel}} but I wrote a 
> {{SeekableByteChannel}} backed by a URI using HTTP range requests and reading 
> from a socket can often return less bytes than asked for. When I used this 
> channel to read a {{ZipFile}} it only read the ZIP central directory 
> partially sometimes because {{IOUtils.readRange()}} called from 
> {{ZipFile.readCentralDirectoryEntry()}} read more bytes than asked for and it 
> stopped parsing directory entries.
> Fix: [https://github.com/apache/commons-compress/pull/214]
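
A sketch of the fix idea (the shape is assumed here, not copied from the 
linked pull request): cap the reused buffer with limit() so a single read can 
never return more bytes than the caller still asked for:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

final class BoundedReadSketch {
    // 'remaining' is how many bytes the caller still wants; since the buffer
    // is reused across calls, its limit must shrink on the final rounds.
    static int readUpTo(final ReadableByteChannel channel, final ByteBuffer buf,
                        final int remaining) throws IOException {
        buf.clear();
        buf.limit(Math.min(buf.capacity(), remaining)); // never read past the request
        return channel.read(buf);
    }
}
{code}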



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-587) 1.21 throws IllegalArgumentException in putArchiveEntry

2021-09-13 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414154#comment-17414154
 ] 

Peter Lee commented on COMPRESS-587:


Hi [~kamali.ali]

Thank you for your report.

Could you provide a detailed test to reproduce this problem?

> 1.21 throws IllegalArgumentException in putArchiveEntry
> ---
>
> Key: COMPRESS-587
> URL: https://issues.apache.org/jira/browse/COMPRESS-587
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Compressors
>Affects Versions: 1.21
>Reporter: Ali Kamali
>Priority: Major
>
> Code stopped working after the upgrade to 1.21 with the following stacktrace:
> {noformat}
> java.lang.IllegalArgumentException: group id '673186305' is too big ( > 
> 2097151 ). Use STAR or POSIX extensions to overcome this limit
>   
>  
>at 
> org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.failForBigNumber(TarArchiveOutputStream.java:651)
>   
>   
>   
>at 
> org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.failForBigNumberWithPosixMessage(TarArchiveOutputStream.java:644)
>   
>   
>   
>at 
> org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.failForBigNumbers(TarArchiveOutputStream.java:626)
>   
>   
>  
>at 
> org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.putArchiveEntry(TarArchiveOutputStream.java:377)
>  {noformat}
> Code looks like this:
> {noformat}
> val tarOutput = new TarArchiveOutputStream(new BufferedOutputStream(new 
> FileOutputStream(outputTarFile)))
> val entry = new TarArchiveEntry(file, fileName)
> tarOutput.putArchiveEntry(archiveEntry)
> {noformat}
> I've traced the issue to this change 
> [https://github.com/apache/commons-compress/commit/afaaacf8ce5ffd0735c4b5e70259068327741ab0]
> In 1.20 no values where set for userId and groupId (both were 0), with 1.21 I 
> now actually get uid and gid populated and they are both bigger than 2097151.
>  
> As a workaround I'll be using STAR or POSIX extensions, but still reporting 
> this as a bug since I wasn't expecting a break in a minor version change. 
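
For reference, the workaround mentioned above looks roughly like this (a 
sketch; the stream setup follows the reporter's snippet):

{code:java}
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

final class BigNumberWorkaroundSketch {
    static TarArchiveOutputStream open(final String outputTarFile) throws IOException {
        final TarArchiveOutputStream tarOutput = new TarArchiveOutputStream(
                new BufferedOutputStream(new FileOutputStream(outputTarFile)));
        // Allow uid/gid (and other numeric fields) above 2097151 by writing
        // POSIX extended headers; BIGNUMBER_STAR would work as well.
        tarOutput.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
        return tarOutput;
    }
}
{code}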



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-591) Fix decoding of 7z files containing LZMA streams with end marker

2021-09-13 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414152#comment-17414152
 ] 

Peter Lee commented on COMPRESS-591:


Hi [~larhzu]

Thank you for your report.

I noticed that there are some downsides to enabling this. I would prefer to 
introduce a new option for it rather than enabling it by default.

BTW, a PR on GitHub is always welcome.

> Fix decoding of 7z files containing LZMA streams with end marker
> 
>
> Key: COMPRESS-591
> URL: https://issues.apache.org/jira/browse/COMPRESS-591
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Lasse Collin
>Priority: Major
> Attachments: lzma-with-eos.7z
>
>
> Some uncommon but valid .7z files contain LZMA streams that use the end of 
> stream marker. Currently Commons Compress together with XZ for Java considers 
> such files to be corrupt.
> XZ for Java 1.9 added a new method 
> [LZMAInputStream.enableRelaxedEndCondition()|https://tukaani.org/xz/xz-javadoc/org/tukaani/xz/LZMAInputStream.html#enableRelaxedEndCondition()]
>  specifically for this issue. To use this feature in Commons Compress, a 
> change is needed to the "decode" function in 
> src/main/java/org/apache/commons/compress/archivers/sevenz/LZMADecoder.java:
> {code:java}
> - return new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + final LZMAInputStream lzmaIn = new LZMAInputStream(in, uncompressedLength, propsByte, dictSize);
> + lzmaIn.enableRelaxedEndCondition();
> + return lzmaIn;
> {code}
> A tiny test file is attached (thanks to Simon for providing it). Another test 
> file "sheet.7z" can be found from 
> <[https://sourceforge.net/p/lzmautils/discussion/708858/thread/822d80d5ea/]>.
> XZ for Java 1.9 is already a few months old, so I apologize for not reporting 
> this earlier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-584) IOUtils.readRange() can read more from a channel than asked for

2021-08-07 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-584:
---
Assignee: Peter Lee

> IOUtils.readRange() can read more from a channel than asked for
> ---
>
> Key: COMPRESS-584
> URL: https://issues.apache.org/jira/browse/COMPRESS-584
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matthijs Laan
>Assignee: Peter Lee
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> When {{IOUtils.readRange(ReadableByteChannel,int)}} gets less than the number 
> of bytes asked for in the first read call it does not reduce the buffer size 
> for the next read call and may read more than asked for.
> This situation is rare when using a {{FileChannel}} but I wrote a 
> {{SeekableByteChannel}} backed by a URI using HTTP range requests and reading 
> from a socket can often return less bytes than asked for. When I used this 
> channel to read a {{ZipFile}} it only read the ZIP central directory 
> partially sometimes because {{IOUtils.readRange()}} called from 
> {{ZipFile.readCentralDirectoryEntry()}} read more bytes than asked for and it 
> stopped parsing directory entries.
> Fix: [https://github.com/apache/commons-compress/pull/214]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-585) ZipFile fails to read a zipfile with a comment or extra data longer than 8024 bytes

2021-08-04 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392876#comment-17392876
 ] 

Peter Lee commented on COMPRESS-585:


_> The cause is that {{ZipFile.readCentralDirectoryEntry()}} calls 
{{IOUtils.readRange()}} and assumes if it returns less than the length it asked 
for that the EOF is reached, however this is not how that method works: it 
returns max COPY_BUF_SIZE bytes (8024), even if EOF has not been reached._

 

Can this problem be reproduced?

_COPY_BUF_SIZE_ is just the limit on the buffer's size. With _rewind_ called, 
the buffer can be reused to copy more than _COPY_BUF_SIZE_ bytes of data.

> ZipFile fails to read a zipfile with a comment or extra data longer than 8024 
> bytes
> ---
>
> Key: COMPRESS-585
> URL: https://issues.apache.org/jira/browse/COMPRESS-585
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.21
>Reporter: Matthijs Laan
>Priority: Minor
> Attachments: COMPRESS-585-test.patch
>
>
> See the attached patch for a unit test demonstrating the issue with a long 
> comment.
> The cause is that {{ZipFile.readCentralDirectoryEntry()}} calls 
> {{IOUtils.readRange()}} and assumes if it returns less than the length it 
> asked for that the EOF is reached, however this is not how that method works: 
> it returns max COPY_BUF_SIZE bytes (8024), even if EOF has not been reached.
> Besides comments and extra data (in the central directory or local file 
> header) longer than 8024 bytes the only other place {{readRange()}} is called 
> is reading filenames, but that seems like a remote edge case and an EOF 
> exception is fine.
> The IOUtils.readRange() JavaDoc does not specify how it communicates EOF. 
> With a blocking channel this would be when it returns a zero length array. It 
> could throw an exception when {{Channel.read()}} returns 0 bytes, because 
> that only happens on non-blocking channels.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-08-01 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391366#comment-17391366
 ] 

Peter Lee commented on COMPRESS-583:


The website is updated. Please have a look. [~francium25]

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}
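
One way to restore byte-identical output (a sketch, not an official 
recommendation): clear the fields that 1.21 now picks up from the local 
filesystem before putting each entry, so the archive bytes depend only on the 
input files again:

{code:java}
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;

final class ReproducibleTarSketch {
    // Reset the fields that vary between machines (uid/gid and the
    // user/group names) back to the pre-1.21 defaults.
    static TarArchiveEntry normalize(final TarArchiveEntry entry) {
        entry.setUserId(0);
        entry.setGroupId(0);
        entry.setUserName("");
        entry.setGroupName("");
        return entry;
    }
}
{code}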



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-07-30 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390837#comment-17390837
 ] 

Peter Lee edited comment on COMPRESS-583 at 7/31/21, 6:50 AM:
--

I pushed a commit 
[d57b8e9|https://github.com/apache/commons-compress/commit/d57b8e9068bc0184b783aa488e613f4fcf2140d6]
 to add this in changelog. Please have a look.

Now the changelog of COMPRESS-404 is :

_Update the class of variable file in TarArchiveEntry from_
 _java.io.File to java.nio.file.Path. Corresponding constructors_
 _and methods are also modified/added._
 _NOTE: The userName, groupName, userID and groupID will also be_
 _set if they are available. The userName and groupName was not_
 _set previously, and the previous value of UserID:GroupID was_
 _0:0 by default._
 _Please note this may cause a reproducibility problem._

_Github Pull Request #97._


was (Author: peterlee):
I pushed a commit 
[d57b8e9|https://github.com/apache/commons-compress/commit/d57b8e9068bc0184b783aa488e613f4fcf2140d6]
 to add this in changelog. Please have a look.

Now the changelog of COMPRESS-404 is :
 _Update the class of variable file in TarArchiveEntry from_
 _java.io.File to java.nio.file.Path. Corresponding constructors_
 _and methods are also modified/added._

_NOTE: The UserID and GroupID will also be read if they are_
 _available. The previous default value of UserID:GroupID was 0:0._
 _This may cause a reproducibility problem._
 _Github Pull Request #97._

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-07-30 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390837#comment-17390837
 ] 

Peter Lee edited comment on COMPRESS-583 at 7/31/21, 2:00 AM:
--

I pushed a commit 
[d57b8e9|https://github.com/apache/commons-compress/commit/d57b8e9068bc0184b783aa488e613f4fcf2140d6]
 to add this in changelog. Please have a look.

Now the changelog of COMPRESS-404 is :
 _Update the class of variable file in TarArchiveEntry from_
 _java.io.File to java.nio.file.Path. Corresponding constructors_
 _and methods are also modified/added._

_NOTE: The UserID and GroupID will also be read if they are_
 _available. The previous default value of UserID:GroupID was 0:0._
 _This may cause a reproducibility problem._
 _Github Pull Request #97._


was (Author: peterlee):
I pushed a commit 
[7bc8679|https://github.com/apache/commons-compress/commit/7bc86793d2533ef314550efe2b79df752abdc8d4]
 to add this in changelog. Please have a look.

Now the changelog of COMPRESS-404 is :
 _Update the class of variable file in TarArchiveEntry from_
 _java.io.File to java.nio.file.Path. Corresponding constructors_
 _and methods are also modified/added._

_NOTE: The UserID and GroupID will also be read if they are_
 _available. The previous default value of UserID:GroupID was 0:0._
 _This may cause a reproducibility problem._
 _Github Pull Request #97._

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-07-30 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390837#comment-17390837
 ] 

Peter Lee commented on COMPRESS-583:


I pushed a commit 
[7bc8679|https://github.com/apache/commons-compress/commit/7bc86793d2533ef314550efe2b79df752abdc8d4]
 to add this in changelog. Please have a look.

Now the changelog of COMPRESS-404 is :
 _Update the class of variable file in TarArchiveEntry from_
 _java.io.File to java.nio.file.Path. Corresponding constructors_
 _and methods are also modified/added._

_NOTE: The UserID and GroupID will also be read if they are_
 _available. The previous default value of UserID:GroupID was 0:0._
 _This may cause a reproducibility problem._
 _Github Pull Request #97._

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-07-30 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390835#comment-17390835
 ] 

Peter Lee commented on COMPRESS-583:


Thanks for your explanations [~francium25].

Now I understand the importance of reproducibility. We haven't done much for 
reproducibility so far - we don't have enough related tests. We will try to 
improve this. PRs are always welcome. :)

 

> _as I didn't find any relevant changelog for this behavioral change._

This is my bad. PR #97 was not meant to introduce uid/gid handling in 
TarArchiveEntry. I reviewed and merged it, but I didn't notice that it could 
cause a reproducibility problem - and I didn't record this in the changelog.

 

As you mentioned, 1.21 is already released and this cannot be reverted. But 
you're right about the changelog. I'm working on updating the release notes of 
1.21.

 

Thank you again for your report.

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-583) 1.21 generates different output binaries compared to older versions as well as on different OSes

2021-07-29 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390263#comment-17390263
 ] 

Peter Lee commented on COMPRESS-583:


Hi [~francium25] [~michael-o]

Sorry about the late reply, and thank you for your report.

 

_Upgrading {{commons-compress}} had always been generating the same compressed 
output byte-to-byte for the same input (i.e., their SHA checksum didn't change 
between versions)._

 

It seems you are expecting Compress to generate exactly the same output across 
different versions and different OSes - IMHO this is not a breaking change. I 
will start a mailing list thread to discuss this.

And of course you are welcome to join the mailing list discussions. :)

 

_Previously, it was always setting UID:GID as {{0:0}} by default_

Yes, the uid/gid was not set when you create a TarArchiveEntry before 1.21, and 
it was changed in 1.21.

 

_So my question is, did you really introduce this fix intentionally (if so, 
where's the doc?) or it happened to be like this (i.e., regression)?_

COMPRESS-404 was not meant to read the UID/GID into TarArchiveEntry. As 
[~michael-o] said, this is the default behavior of {{tar(1)}} but we didn't do 
it before 1.21, so I approved that PR.

 

_I agree that this is an issue, it has not been documented as such:_ 
[_https://commons.apache.org/proper/commons-compress/changes-report.html#a1.21_]

Yes. I think we didn't document this.

 

> 1.21 generates different output binaries compared to older versions as well 
> as on different OSes
> 
>
> Key: COMPRESS-583
> URL: https://issues.apache.org/jira/browse/COMPRESS-583
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Chanseok Oh
>Priority: Major
>
> Upgrading {{commons-compress}} had always been generating the same compressed 
> output byte-to-byte for the same input (i.e., their SHA checksum didn't 
> change between versions). However, starting with 1.21, we noticed it's 
> generating different output than what previous versions are generating.
> We also noticed that the same code generates different binaries on different 
> OSes. For example, 1.21 on Linux is different from 1.21 on Mac.
> However, at least on the same OS, 1.21 seems to reproducibly generate the 
> same output.
> See the context at [https://github.com/GoogleContainerTools/jib/pull/3342]
> 
> *UPDATE*: running diffoscope reveals that 1.21 is picking up the user and 
> group of a local environment.
> (output below manually reformatted slightly for readability)
> {{$ diffoscope 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{--- 
> 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar.gz}}
> {{+++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar.gz}}
> {{│ --- 6d2763b0f3940d324ea6b55386429e5b173899608abf7d1bff62e25dd2e4dcea.tar}}
> {{├── +++ 
> 32258c626498c13412679442e3417811bc7ab801c6928da2c2a97e0bbc380a88.tar}}
> {{│ ├── file list}}
> {{│ │ @@ -1,3 +1,3 @@}}
> {{│ │ {color:#de350b}-drwxr-xr-x 0                 0          0 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ {color:#00875a}+drwxr-xr-x 0 chanseok (252384) eng (5000) 0 1970-01-01 
> 00:00:01.00 app/{color}}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileB.txt}}
> {{│ │ -rw-r--r--  0                 0          0 0 1970-01-01 00:00:01.00 
> app/fileC.txt}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-581) Make getLocalHeaderOffset() public

2021-06-01 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17354964#comment-17354964
 ] 

Peter Lee commented on COMPRESS-581:


Hi [~franckval]

I think it's not a good idea to make _getLocalHeaderOffset_ public.

Most users of Commons Compress are not familiar with the ZIP specification and 
have no idea what a Local File Header is. We should avoid exposing 
implementation details.

If you do need the LFH offset, you can get it via the debugger - I think this 
is enough for zip file analysis.

> Make getLocalHeaderOffset() public
> --
>
> Key: COMPRESS-581
> URL: https://issues.apache.org/jira/browse/COMPRESS-581
> Project: Commons Compress
>  Issue Type: Wish
>Affects Versions: 1.20
>Reporter: Franck Valentin
>Priority: Minor
>
> Hi,
> I use this library to analyse zip files and get the entries and to do so I 
> need to get the entries together with their offsets within the archive. I use 
> ZipArchiveEntry but the issue is that getLocalHeaderOffset() is protected 
> instead of being public.
> Looking at the source code I don't understand the reason behind it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-574) Byte range support in archive creation

2021-04-17 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324225#comment-17324225
 ] 

Peter Lee commented on COMPRESS-574:


I see. Thank you for your detailed information.

 

> _File.createTempFile how do you size the temporary directory ?_

We can avoid creating a file by creating the zip in memory - see 
_ZipTestCase.testZipArchiveCreationInMemory_ for a detailed implementation, 
and the sketch below.
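
A minimal sketch of that in-memory approach (the entry name and content are 
made up; the channel type is the public SeekableInMemoryByteChannel):

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream;
import org.apache.commons.compress.utils.SeekableInMemoryByteChannel;

final class InMemoryZipSketch {
    static byte[] createZip() throws IOException {
        final SeekableInMemoryByteChannel channel = new SeekableInMemoryByteChannel();
        try (ZipArchiveOutputStream out = new ZipArchiveOutputStream(channel)) {
            out.putArchiveEntry(new ZipArchiveEntry("hello.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeArchiveEntry();
        }
        // The channel's backing array may be larger than what was written.
        return Arrays.copyOf(channel.array(), (int) channel.size());
    }
}
{code}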

 

> _Also a third problem I did not mention is the time before the client receive 
>the first byte, we wait for the ZIP creation to be complete before creating 
>the ZipIS._

Once _closeArchiveEntry()_ has been called, the LFH and entry data have 
already been written to the target file or byte array. I think the raw data 
can be transferred at that point, which happens before the ZIP is finished.

 

> _Recently download servers got overloaded, so I look again and still found 
> nothing. So I decided to look at the ZIP spec directly to check if it was 
> possible to stream. And it was, so I implemented it. I give the code here, so 
> someone like myself in the future will find it._

Thank you so much once again! And I'd like to hear what others think about your patch.

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-574) Byte range support in archive creation

2021-04-17 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324195#comment-17324195
 ] 

Peter Lee commented on COMPRESS-574:


> _curl -r 200-500 download?zip_name=myzip&file_names=A,C_

For this case, why would we need to read a specific range of bytes of entries 
from a zip?

I think most people will never use this.

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-574) Byte range support in archive creation

2021-04-17 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324194#comment-17324194
 ] 

Peter Lee commented on COMPRESS-574:


> _So if a user request download?zip_name=myzip&file_names=A,C_

 

For this case, I think we can implement it with existing APIs like this:

 
{code:java}
@Test
public void pickEntriesFromZipAndCreateANewZip() throws IOException {
    final Set<String> selectedEntries = new HashSet<>(Arrays.asList(
        "src/main/java/org/apache/commons/compress/archivers/zip/AbstractUnicodeExtraField.java",
        "src/main/java/org/apache/commons/compress/archivers/zip/UnixStat.java",
        "src/main/java/org/apache/commons/compress/archivers/zip/ZipShort.java"));

    final ZipFile zipFile = new ZipFile(getFile("ordertest.zip"), ZipEncodingHelper.UTF8);

    final File archive = File.createTempFile("test.", ".zip");
    final ZipArchiveOutputStream zipArchiveOutputStream = new ZipArchiveOutputStream(archive);

    // copy only the selected entries into the new zip, reusing their metadata
    for (final ZipArchiveEntry entry : Collections.list(zipFile.getEntries())) {
        if (!selectedEntries.contains(entry.getName())) {
            continue;
        }

        zipArchiveOutputStream.putArchiveEntry(entry);
        final InputStream inputStream = zipFile.getInputStream(entry);
        IOUtils.copy(inputStream, zipArchiveOutputStream);
        inputStream.close();
        zipArchiveOutputStream.closeArchiveEntry();
    }

    zipArchiveOutputStream.close();

    // display the entries of the new zip
    final ZipArchiveInputStream zipArchiveInputStream = new ZipArchiveInputStream(new FileInputStream(archive));
    ArchiveEntry entry;
    while ((entry = zipArchiveInputStream.getNextEntry()) != null) {
        System.out.println(entry.getName());
    }
}
{code}
 

 

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-574) Byte range support in archive creation

2021-04-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324146#comment-17324146
 ] 

Peter Lee commented on COMPRESS-574:


Thank you for your explanation.

 

So you are trying to do something like:
 # Pick a subset of the entries from a zip and create a new zip containing 
only those entries.
 # Be able to read part of the raw data of the zip for a subset of the entries.

And you are trying to do these things because you are using some Amazon cloud 
service. Is that right?

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-574) Byte range support in archive creation

2021-04-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322733#comment-17322733
 ] 

Peter Lee edited comment on COMPRESS-574 at 4/16/21, 9:04 AM:
--

Hi [~gaellalire]. Thank you for your contribution!

 

_When you have a ZIP which contains N components and you want to let the user 
choose which components it needs, you need to create 2^N - 1 ZIP._

_Also ZipOutputStream will never allow you to resume from a byte range, you 
need to generate all previous data._

 

I'm a little confused by the descriptions above. What problem are you trying 
to solve?


was (Author: peterlee):
Hi [~gaellalire]. Thank you for your contribution!

 

_When you have a ZIP which contains N components and you want to let the user 
choose which components it needs, you need to create 2^N - 1 ZIP._

_Also ZipOutputStream will never allow you to resume from a byte range, you 
need to generate all previous data._

 

 

I'm a little confused by the descriptions above. What problem are you trying 
to solve?

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-574) Byte range support in archive creation

2021-04-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322733#comment-17322733
 ] 

Peter Lee commented on COMPRESS-574:


Hi [~gaellalire]. Thank you for your contribution!

 

_When you have a ZIP which contains N components and you want to let the user 
choose which components it needs, you need to create 2^N - 1 ZIP._

_Also ZipOutputStream will never allow you to resume from a byte range, you 
need to generate all previous data._

 

 

I'm a little confused by the descriptions above. What problem are you trying 
to solve?

> Byte range support in archive creation
> --
>
> Key: COMPRESS-574
> URL: https://issues.apache.org/jira/browse/COMPRESS-574
> Project: Commons Compress
>  Issue Type: Improvement
>  Components: Archivers
>Reporter: Gaël Lalire
>Priority: Minor
> Attachments: DynamicZip.java, DynamicZipTest.java
>
>
> When you have a ZIP which contains _N_ components and you want to let the 
> user choose which components it needs, you need to create _2^N - 1_ ZIP.
> So the idea is to store each component once (or twice if you want both 
> deflated and stored version), and create the ZIP on the fly.
> For the moment you can stream with a ZipOutputStream but if you need an 
> InputStream things get a lot harder. I guess programs are writing the ZIP to 
> a file system and read from it after, so not really a streaming anymore.
> Also ZipOutputStream will never allow you to resume from a byte range, you 
> need to generate all previous data.
> So I made a class to do that, I think such functionality has its place in 
> commons compress.
> You can see my code attached and adapt it for better integration / other 
> archive type support or simply to get inspired.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-572) Update zstd-jni dependency version to 1.4.9-1

2021-03-19 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305287#comment-17305287
 ] 

Peter Lee commented on COMPRESS-572:


+1.

Zstd 1.4.8 is affected by CVE-2021-24032 - even though it's a problem in the 
CLI, and I don't think zstd-jni is affected.

> Update zstd-jni dependency version to 1.4.9-1
> -
>
> Key: COMPRESS-572
> URL: https://issues.apache.org/jira/browse/COMPRESS-572
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> Apache Avro and Parquet projects are releasing new versions with zstd-jni 
> dependency version at 1.4.9-1.  It would be desireable for the next version 
> of commons-compress to use the same version for compatibility reasons.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-571) 7z random access fails on shuffled entry list

2021-03-19 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305286#comment-17305286
 ] 

Peter Lee commented on COMPRESS-571:


I don't see much difference between a copy and an unmodifiable list.

To be consistent with other implementations in Compress, I think returning a 
copy is a reasonable choice here. A sketch of both options follows.
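
The two options side by side (a sketch; the helper names are made up, and 
SevenZArchiveEntry would be the real element type):

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class EntriesSketch {
    // Defensive copy: callers may shuffle or sort freely without touching
    // the archive's internal entry order.
    static <T> List<T> copyOf(final List<T> entries) {
        return new ArrayList<>(entries);
    }

    // Alternative: an unmodifiable view fails fast on mutation instead.
    static <T> List<T> readOnlyViewOf(final List<T> entries) {
        return Collections.unmodifiableList(entries);
    }
}
{code}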

> 7z random access fails on shuffled entry list
> -
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
>
> I noticed a problem on a 7z file and could reproduce the error if the 
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
> try (final SevenZFile sevenZFile = new 
> SevenZFile(getFile("COMPRESS-256.7z"))) {
> List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) 
> sevenZFile.getEntries();
> Collections.shuffle(entries);
> for (final SevenZArchiveEntry entry : entries) {
> IOUtils.toByteArray(sevenZFile.getInputStream(entry));
> }
> }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-571) 7z random access fails on shuffled entry list

2021-03-17 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303178#comment-17303178
 ] 

Peter Lee commented on COMPRESS-571:


Fixed with 65a5c75c.

> 7z random access fails on shuffled entry list
> --
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
>
> I noticed a problem on a 7z file and could reproduce the error if the 
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
> try (final SevenZFile sevenZFile = new 
> SevenZFile(getFile("COMPRESS-256.7z"))) {
> List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) 
> sevenZFile.getEntries();
> Collections.shuffle(entries);
> for (final SevenZArchiveEntry entry : entries) {
> IOUtils.toByteArray(sevenZFile.getInputStream(entry));
> }
> }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-571) 7z random access fails on shuffled entry list

2021-03-17 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-571:
---
Assignee: Peter Lee

> 7z random access fails on shuffled entry list
> --
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
>
> I noticed a problem on a 7z file and could reproduce the error if the 
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
> try (final SevenZFile sevenZFile = new 
> SevenZFile(getFile("COMPRESS-256.7z"))) {
> List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) 
> sevenZFile.getEntries();
> Collections.shuffle(entries);
> for (final SevenZArchiveEntry entry : entries) {
> IOUtils.toByteArray(sevenZFile.getInputStream(entry));
> }
> }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-571) 7z random access fails on shuffled entry list

2021-03-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303019#comment-17303019
 ] 

Peter Lee commented on COMPRESS-571:


With _getEntries()_ returning a copy of the entries, I can successfully run this test.

> 7z random access fails on shuffled entry list
> --
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
>
> I noticed a problem on a 7z file and could reproduce the error if the 
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
> try (final SevenZFile sevenZFile = new 
> SevenZFile(getFile("COMPRESS-256.7z"))) {
> List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) 
> sevenZFile.getEntries();
> Collections.shuffle(entries);
> for (final SevenZArchiveEntry entry : entries) {
> IOUtils.toByteArray(sevenZFile.getInputStream(entry));
> }
> }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-571) 7z random access fails on shuffled entry list

2021-03-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303018#comment-17303018
 ] 

Peter Lee commented on COMPRESS-571:


The _getEntries()_ method returns the reader's internal list of entries, so 
shuffling the returned list also shuffles the order used internally.

Maybe we should return a copy of the entries in _getEntries()_? A sketch of that 
idea follows below.
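
A minimal sketch of the copy idea (assuming a hypothetical internal array 
_files_ that holds the parsed entries - the real field name may differ):
{code:java}
// Hedged sketch, not the actual SevenZFile code: hand out a copy so that
// callers may shuffle or sort the returned list without disturbing the
// internal entry order used for sequential decompression.
public Iterable<SevenZArchiveEntry> getEntries() {
    return new ArrayList<>(Arrays.asList(files)); // 'files' is an assumed field
}
{code}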

> 7z random access fails on shuffled entry list
> --
>
> Key: COMPRESS-571
> URL: https://issues.apache.org/jira/browse/COMPRESS-571
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
>
> I noticed a problem on a 7z file and could reproduce the error if the 
> InputStream is retrieved after shuffling the entries.
> This test fails with a checksum verification error
> {code:java}
> @Test
> public void retrieveInputStreamForShuffledEntries() throws IOException {
> try (final SevenZFile sevenZFile = new 
> SevenZFile(getFile("COMPRESS-256.7z"))) {
> List<SevenZArchiveEntry> entries = (List<SevenZArchiveEntry>) 
> sevenZFile.getEntries();
> Collections.shuffle(entries);
> for (final SevenZArchiveEntry entry : entries) {
> IOUtils.toByteArray(sevenZFile.getInputStream(entry));
> }
> }
> }
> {code}
> This is the exception
> {code:java}
> java.io.IOException: Checksum verification failed
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:94)
>   at 
> org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:74)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:87)
>   at org.apache.commons.compress.utils.IOUtils.copy(IOUtils.java:62)
>   at 
> org.apache.commons.compress.utils.IOUtils.toByteArray(IOUtils.java:247)
>   at 
> org.apache.commons.compress.archivers.sevenz.SevenZFileTest.retrieveInputStreamForShuffledEntries(SevenZFileTest.java:616)
> {code}
> This also fails on the current master with the same error



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-557) Have some kind of reset() in LZ77Compressor

2021-03-07 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297133#comment-17297133
 ] 

Peter Lee commented on COMPRESS-557:


I was testing with LZ77 and wanted to compress several files using 
LZ77Compressor. I found that I have to create a new _LZ77Compressor_ instance 
every time I compress a new file.

I was thinking that we could have a _reset()_ in LZ77Compressor so we can reuse 
the same LZ77Compressor instance without making a new one. The sketch below 
illustrates the intended usage.

This may not be needed most of the time, as LZ77 is currently only used 
internally as a compressor.
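
For illustration, the reuse pattern I have in mind (a sketch: _reset()_ is the 
proposed method, not an existing API, and the _params_/_callback_ setup is 
omitted):
{code:java}
// Sketch of the intended reuse; reset() is the proposal here.
LZ77Compressor compressor = new LZ77Compressor(params, callback);
for (File f : files) {
    compressor.compress(Files.readAllBytes(f.toPath()));
    compressor.finish();
    compressor.reset(); // proposed: clear window and hash state so the instance is reusable
}
{code}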

> Have some kind of reset() in LZ77Compressor
> ---
>
> Key: COMPRESS-557
> URL: https://issues.apache.org/jira/browse/COMPRESS-557
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Peter Lee
>Priority: Minor
>
> It would be useful if we have a _reset()_ in LZ77Compressor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-539) TarArchiveInputStream allocates a lot of memory when iterating through an archive

2021-03-07 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297123#comment-17297123
 ] 

Peter Lee commented on COMPRESS-539:


I think we can close this issue now.

> TarArchiveInputStream allocates a lot of memory when iterating through an 
> archive
> -
>
> Key: COMPRESS-539
> URL: https://issues.apache.org/jira/browse/COMPRESS-539
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Assignee: Peter Lee
>Priority: Major
> Attachments: Don't_call_InputStream#skip.patch, 
> Reuse_recordBuffer.patch, image-2020-06-21-10-58-07-917.png, 
> image-2020-06-21-10-58-43-255.png, image-2020-06-21-10-59-10-825.png, 
> image-2020-07-05-22-10-07-402.png, image-2020-07-05-22-11-25-526.png, 
> image-2020-07-05-22-32-15-131.png, image-2020-07-05-22-32-31-511.png
>
>
>  I iterated through the linux source tar and noticed some unneeded 
> allocations happen without extracting any data.
> Reproducing code
> {code:java}
> File tarFile = new File("linux-5.7.1.tar");
> try (TarArchiveInputStream in = new 
> TarArchiveInputStream(Files.newInputStream(tarFile.toPath()))) {
> TarArchiveEntry entry;
> while ((entry = in.getNextTarEntry()) != null) {
> }
> }
> {code}
> The measurement was done on Java 11.0.7 with the Java Flight Recorder. 
> Options used: 
> -XX:StartFlightRecording=settings=profile,filename=allocations.jfr
> Baseline with the current master implementation:
>  Estimated TLAB allocation: 293MiB
> !image-2020-06-21-10-58-07-917.png!
> 1. IOUtils.skip -> input.skip(numToSkip)
>  This delegates in my test scenario to the InputStream.skip implementation 
> which allocates a new byte[] for every invocation. By simply commenting out 
> the while loop which calls the skip method the estimated TLAB allocation 
> drops to 164MiB (-129MiB).
>  !image-2020-06-21-10-58-43-255.png! 
>  Commenting out the skip call does not seem to be the best solution, but it 
> was quick for me to see how much memory can be saved. Also, no unit tests 
> were failing for me.
> 2. TarArchiveInputStream.readRecord
>  For every read of the record a new byte[] is created. Since the record size 
> does not change the byte[] can be reused and created when instantiating the 
> TarStream. This optimization is already present in the 
> TarArchiveOutputStream. Reusing the buffer reduces the estimated TLAB 
> allocations further to 128MiB (-36MiB).
>  !image-2020-06-21-10-59-10-825.png!
> I attached the patches I used so the results can be verified.
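
To make the second optimization concrete, here is a hedged sketch of the 
record-buffer reuse described above (illustrative names that loosely mirror 
TarArchiveInputStream, not the actual patch):
{code:java}
// Allocate the record buffer once per stream instead of once per record.
private final byte[] recordBuffer = new byte[recordSize];

protected byte[] readRecord() throws IOException {
    final int readNow = IOUtils.readFully(inputStream, recordBuffer);
    count(readNow);
    return readNow == recordSize ? recordBuffer : null; // no fresh byte[] per record
}
{code}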



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-567) IllegalArgumentException in ZipFile.positionAtCentralDirectory

2021-02-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291491#comment-17291491
 ] 

Peter Lee commented on COMPRESS-567:


I see. Thank you for your explanation.

> IllegalArgumentException in ZipFile.positionAtCentralDirectory
> --
>
> Key: COMPRESS-567
> URL: https://issues.apache.org/jira/browse/COMPRESS-567
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Attachments: crash.zip
>
>
> The following snippet of code throws an undeclared IllegalArgumentException:
> {code:java}
> byte[] bytes = Base64.getDecoder().decode("UEsFBgAAAQD//1AAJP9QAA==");
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes);
> try {
> ZipFile file = new ZipFile(input);
> } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.IllegalArgumentException: Position has to be in range 0.. 2147483647
>   at 
> org.apache.commons.compress.utils.SeekableInMemoryByteChannel.position(SeekableInMemoryByteChannel.java:94)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory32(ZipFile.java:1128)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory(ZipFile.java:1037)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.populateFromCentralDirectory(ZipFile.java:702)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:371)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-567) IllegalArgumentException in ZipFile.positionAtCentralDirectory

2021-02-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291474#comment-17291474
 ] 

Peter Lee commented on COMPRESS-567:


Just curious about the test: how was the test file 
_Base64.getDecoder().decode("UEsFBgAAAQD//1AAJP9QAA==")_ generated? 

I believe this zip was generated by some fuzzer, and you encoded it with Base64 
to simplify the test. Am I right?

> IllegalArgumentException in ZipFile.positionAtCentralDirectory
> --
>
> Key: COMPRESS-567
> URL: https://issues.apache.org/jira/browse/COMPRESS-567
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Attachments: crash.zip
>
>
> The following snippet of code throws an undeclared IllegalArgumentException:
> {code:java}
> byte[] bytes = Base64.getDecoder().decode("UEsFBgAAAQD//1AAJP9QAA==");
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes);
> try {
> ZipFile file = new ZipFile(input);
> } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.IllegalArgumentException: Position has to be in range 0.. 2147483647
>   at 
> org.apache.commons.compress.utils.SeekableInMemoryByteChannel.position(SeekableInMemoryByteChannel.java:94)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory32(ZipFile.java:1128)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory(ZipFile.java:1037)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.populateFromCentralDirectory(ZipFile.java:702)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:371)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-568) NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData

2021-02-25 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-568.

Fix Version/s: 1.21
   Resolution: Fixed

> NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData
> 
>
> Key: COMPRESS-568
> URL: https://issues.apache.org/jira/browse/COMPRESS-568
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Fix For: 1.21
>
> Attachments: npe.zip
>
>
> The following snippet of code throws an undeclared NullPointerException:
> {code:java}
> byte[] bytes = 
> Base64.getDecoder().decode("UEsDBAoACQAAAGu0ukYdiHewEwcDABwAYWFhVVQDAAn5ygAAUEsFBgAC/0IABQAAUEsBAh4DCgAJAP8ABbS6RgAAIAAHAAABAABQSwUGAP///wU=");
>  
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes); 
> try { ZipFile file = new ZipFile(input); } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.commons.compress.archivers.zip.ZipLong.getBytes()" because 
> "this.modifyTime" is null
>   at 
> org.apache.commons.compress.archivers.zip.X5455_ExtendedTimestamp.getLocalFileDataData(X5455_ExtendedTimestamp.java:180)
>   at 
> org.apache.commons.compress.archivers.zip.ExtraFieldUtils.mergeLocalFileDataData(ExtraFieldUtils.java:250)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:691)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.addExtraField(ZipArchiveEntry.java:573)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.mergeExtraFields(ZipArchiveEntry.java:903)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:676)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.resolveLocalFileHeaderData(ZipFile.java:1237)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:373)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.
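
For context, a hedged sketch of the failure mode in the trace above (flag and 
field names are assumptions, not the exact X5455_ExtendedTimestamp source): the 
flags byte claims a modify time is present, but _modifyTime_ was never 
populated, so _getBytes()_ is invoked on null. A null check avoids the NPE:
{code:java}
// Illustrative guard; MODIFY_TIME_BIT, modifyTime, data and pos are assumed names.
if ((flags & MODIFY_TIME_BIT) == MODIFY_TIME_BIT && modifyTime != null) {
    System.arraycopy(modifyTime.getBytes(), 0, data, pos, 4); // 4-byte unix time
    pos += 4;
}
{code}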



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-568) NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData

2021-02-25 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291318#comment-17291318
 ] 

Peter Lee commented on COMPRESS-568:


That's fine. Thank you for reporting this. :)

> NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData
> 
>
> Key: COMPRESS-568
> URL: https://issues.apache.org/jira/browse/COMPRESS-568
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Attachments: npe.zip
>
>
> The following snippet of code throws an undeclared NullPointerException:
> {code:java}
> byte[] bytes = 
> base64.getDecoder().decode("UEsDBAoACQAAAGu0ukYdiHewEwcDABwAYWFhVVQDAAn5ygAAUEsFBgAC/0IABQAAUEsBAh4DCgAJAP8ABbS6RgAAIAAHAAABAABQSwUGAP///wU=");
>  
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes); 
> try { ZipFile file = new ZipFile(input); } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.commons.compress.archivers.zip.ZipLong.getBytes()" because 
> "this.modifyTime" is null
>   at 
> org.apache.commons.compress.archivers.zip.X5455_ExtendedTimestamp.getLocalFileDataData(X5455_ExtendedTimestamp.java:180)
>   at 
> org.apache.commons.compress.archivers.zip.ExtraFieldUtils.mergeLocalFileDataData(ExtraFieldUtils.java:250)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:691)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.addExtraField(ZipArchiveEntry.java:573)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.mergeExtraFields(ZipArchiveEntry.java:903)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:676)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.resolveLocalFileHeaderData(ZipFile.java:1237)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:373)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-567) IllegalArgumentException in ZipFile.positionAtCentralDirectory

2021-02-25 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290875#comment-17290875
 ] 

Peter Lee commented on COMPRESS-567:


The offset of the start of the CFH is a 4-byte unsigned value with a maximum of 
2^32 - 1 = 4,294,967,295, which may exceed the allowed range of 
SeekableInMemoryByteChannel (see the sketch below).

So you are expecting some other exception instead of IllegalArgumentException, 
is that right?
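
For illustration, the ranges involved (plain arithmetic):
{code:java}
long maxCfhOffset = 0xFFFFFFFFL;       // 4-byte unsigned max: 4,294,967,295
long channelLimit = Integer.MAX_VALUE; // byte[]-backed channel: 2,147,483,647
// Any CFH offset above channelLimit makes SeekableInMemoryByteChannel.position()
// throw the IllegalArgumentException shown in the stack trace.
System.out.println(maxCfhOffset > channelLimit); // true
{code}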

> IllegalArgumentException in ZipFile.positionAtCentralDirectory
> --
>
> Key: COMPRESS-567
> URL: https://issues.apache.org/jira/browse/COMPRESS-567
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Attachments: crash.zip
>
>
> The following snippet of code throws an undeclared IllegalArgumentException:
> {code:java}
> byte[] bytes = Base64.getDecoder().decode("UEsFBgAAAQD//1AAJP9QAA==");
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes);
> try {
> ZipFile file = new ZipFile(input);
> } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.IllegalArgumentException: Position has to be in range 0.. 2147483647
>   at 
> org.apache.commons.compress.utils.SeekableInMemoryByteChannel.position(SeekableInMemoryByteChannel.java:94)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory32(ZipFile.java:1128)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory(ZipFile.java:1037)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.populateFromCentralDirectory(ZipFile.java:702)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:371)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-568) NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData

2021-02-25 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290866#comment-17290866
 ] 

Peter Lee commented on COMPRESS-568:


Hi, Fabian.

I have tested this with the latest Commons Compress code and got no NPE.

I think this issue has already been fixed.

> NullPointerException in X5455_ExtendedTimestamp.getLocalFileDataData
> 
>
> Key: COMPRESS-568
> URL: https://issues.apache.org/jira/browse/COMPRESS-568
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Fabian Meumertzheim
>Priority: Major
> Attachments: npe.zip
>
>
> The following snippet of code throws an undeclared NullPointerException:
> {code:java}
> byte[] bytes = 
> Base64.getDecoder().decode("UEsDBAoACQAAAGu0ukYdiHewEwcDABwAYWFhVVQDAAn5ygAAUEsFBgAC/0IABQAAUEsBAh4DCgAJAP8ABbS6RgAAIAAHAAABAABQSwUGAP///wU=");
>  
> SeekableInMemoryByteChannel input = new SeekableInMemoryByteChannel(bytes); 
> try { ZipFile file = new ZipFile(input); } catch (IOException ignored) {}
> {code}
> The stack trace is:
> {noformat}
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.commons.compress.archivers.zip.ZipLong.getBytes()" because 
> "this.modifyTime" is null
>   at 
> org.apache.commons.compress.archivers.zip.X5455_ExtendedTimestamp.getLocalFileDataData(X5455_ExtendedTimestamp.java:180)
>   at 
> org.apache.commons.compress.archivers.zip.ExtraFieldUtils.mergeLocalFileDataData(ExtraFieldUtils.java:250)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:691)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.addExtraField(ZipArchiveEntry.java:573)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.mergeExtraFields(ZipArchiveEntry.java:903)
>   at 
> org.apache.commons.compress.archivers.zip.ZipArchiveEntry.setExtra(ZipArchiveEntry.java:676)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.resolveLocalFileHeaderData(ZipFile.java:1237)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:373)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:318)
>   at 
> org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:274)
> {noformat}
> I also attached the input as a ZIP file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-24 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289867#comment-17289867
 ] 

Peter Lee commented on COMPRESS-565:


Hi [~evgenii.bovykin]

Please have a look at whether my PR works: 

[https://github.com/PeterAlfredLee/commons-compress.git] with branch 
COMPRESS-565

I tested on Windows with Expand-Archive/7z, and the archive was successfully extracted.

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
> Attachments: commons-compress-1.21-SNAPSHOT.jar, 
> image-2021-02-20-15-51-21-747.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-21 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288128#comment-17288128
 ] 

Peter Lee commented on COMPRESS-565:


> Using the attached jar helps my issue. Expand-Archive successfully extracts 
>the archive, 7z doesn't complain about headers.

Great. I didn't expect that Expand-Archive and 7z are reading the CFH.

 

> [~peterlee]no problem if you want to port it, but I can do so as well.

I can do this with a GH pull request. :)

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
> Attachments: commons-compress-1.21-SNAPSHOT.jar, 
> image-2021-02-20-15-51-21-747.png
>
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-20 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287847#comment-17287847
 ] 

Peter Lee commented on COMPRESS-565:


> this wouldn't be sensitive to the unicode extra field, but I really suspect 
>the lfh offset inside of the central directory to be what causes the problem.

 

Interesting. I can give it a try tomorrow, and I will report the result then.

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
> Attachments: commons-compress-1.21-SNAPSHOT.jar, 
> image-2021-02-20-15-51-21-747.png
>
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-20 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287589#comment-17287589
 ] 

Peter Lee commented on COMPRESS-565:


I'm not familiar with the *Expand-Archive Powershell utility*. Is it open source 
or not? I can't find anything on Google.

7zip is open source, but I'm not familiar with its code. :(

The difference between using 
_output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)_
 and not using it is whether we add the _Info-ZIP Unicode Path Extra Field_ to 
the entry's extra fields. And I think the reason why 7z is complaining and the 
*Expand-Archive Powershell utility* on Windows can't extract the archive is: 
*the _Info-ZIP Unicode Path Extra Field_ is not supported by them*.

See also: section 4.6.9 of the [zip 
APPNOTE|https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT] for more 
detailed information.

 

With _ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS_ being set, we will 
always add the _Info-ZIP Unicode Path Extra Field_, which can be seen in the 
generated zip:

!image-2021-02-20-15-51-21-747.png!

Let me give a brief explanation:

First of all, the zip format uses little endian.

The first 2 bytes 0x7075 are the signature of the _Info-ZIP Unicode Path Extra 
Field_, and 0x000e is the size of this field, which is 14.

The 0x01 is the version of this extra field, which is currently always 1 
(according to the [zip APPNOTE|https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT]).

The 4 bytes 0x7df6c07c are the CRC32 checksum of the file name (which can easily 
be checked with any CRC32 tool using the name _input.bin_).

The 9 bytes 0x69 6e 70 75 74 2e 62 69 6e are the UTF-8 encoding of the file 
name, which is _input.bin_.

You can see that 9 + 4 + 1 = 14 is exactly the field length mentioned above. So 
I think we have built a correct _Info-ZIP Unicode Path Extra Field_. A worked 
sketch of this layout follows below.
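
To make the byte layout concrete, here is a small self-contained sketch that 
rebuilds the field for _input.bin_ (a toy illustration, not 
ZipArchiveOutputStream code):
{code:java}
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class UnicodePathFieldDemo {
    public static void main(String[] args) {
        byte[] name = "input.bin".getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(name); // CRC32 of the raw file name bytes

        int dataSize = 1 + 4 + name.length; // version + CRC32 + name = 14 here
        ByteBuffer buf = ByteBuffer.allocate(4 + dataSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 0x7075);     // field signature "up"
        buf.putShort((short) dataSize);   // 0x000e = 14
        buf.put((byte) 1);                // version, always 1
        buf.putInt((int) crc.getValue()); // the comment above reports 0x7df6c07c
        buf.put(name);                    // UTF-8 bytes of the file name

        System.out.printf("data size = %d, crc32 = 0x%08x%n", dataSize, crc.getValue());
    }
}
{code}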

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
> Attachments: image-2021-02-20-15-51-21-747.png
>
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-19 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-565:
---
Attachment: image-2021-02-20-15-51-21-747.png

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
> Attachments: image-2021-02-20-15-51-21-747.png
>
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-19 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287572#comment-17287572
 ] 

Peter Lee commented on COMPRESS-565:


I see. I can reproduce this issue now.

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-18 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286905#comment-17286905
 ] 

Peter Lee commented on COMPRESS-565:


I also tested on Windows using 7zip 19.00, and it successfully extracted the 
_output.zip_ without reporting any errors.

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-18 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286884#comment-17286884
 ] 

Peter Lee commented on COMPRESS-565:


And I'm testing with UnZip 6.00.

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-18 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286882#comment-17286882
 ] 

Peter Lee commented on COMPRESS-565:


I'm testing like this:

 
{code:java}
RandomAccessFile randomAccessFile = new RandomAccessFile("archive/input.bin", "rw");
randomAccessFile.setLength(1024L * 1024 * 1024 * 5);

File outputFile = new File("archive/output.zip");
outputFile.createNewFile();
try (ZipArchiveOutputStream zipArchiveOutputStream = new ZipArchiveOutputStream(
        new BufferedOutputStream(new FileOutputStream(outputFile)))) {
    zipArchiveOutputStream.setUseZip64(Zip64Mode.Always);
    zipArchiveOutputStream.setCreateUnicodeExtraFields(
            ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS);

    zipArchiveOutputStream.putArchiveEntry(new ZipArchiveEntry("input.bin"));

    InputStream inputStream = new FileInputStream("archive/input.bin");
    IOUtils.copy(inputStream, zipArchiveOutputStream);

    zipArchiveOutputStream.closeArchiveEntry();
}
{code}
And I think this test is exactly the same as yours.

 

I'm testing on Ubuntu, and I can successfully extract the generated output.zip 
using the _unzip_ command without any errors being reported.

 

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-18 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286828#comment-17286828
 ] 

Peter Lee commented on COMPRESS-565:


Not familiar with Gradle and Kotlin. :(

Is the input.bin an empty file with a size of 5GB?

Or could you provide an example in Java using Maven?

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285685#comment-17285685
 ] 

Peter Lee commented on COMPRESS-565:


Hi, Evgenii. Thank you for reporting this.

Could you reproduce this issue using the latest version of Compress?

And I didn't find an 'archive/input.bin' in your GH project. Is there any way 
you can share it (if this file doesn't contain anything important)?

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-565) Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream

2021-02-16 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-565:
---
Assignee: Peter Lee

> Regression - Corrupted headers when using 64 bit ZipArchiveOutputStream
> ---
>
> Key: COMPRESS-565
> URL: https://issues.apache.org/jira/browse/COMPRESS-565
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
>Reporter: Evgenii Bovykin
>Assignee: Peter Lee
>Priority: Major
>
> We've recently updated commons-compress library from version 1.9 to 1.20 and 
> now experiencing the problem that didn't occur before.
>  
> When using ZipArchiveOutputStream to archive 5Gb file and setting the 
> following fields
> {{output.setUseZip64(Zip64Mode.Always)}}
>  
> {{output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)}}
> resulting archive contains corrupted headers.
> *Expand-Archive Powershell utility cannot extract the archive at all with the 
> error about corrupted header. 7zip also complains about it, but can extract 
> the archive.*
>  
> The problem didn't appear when using library version 1.9.
>  
> I've created a sample project that reproduces the error - 
> [https://github.com/missingdays/commons-compress-example]
> Issue doesn't reproduce if you do any of the following:
>  
>  # Downgrade library to version 1.9
>  # Remove 
> output.setCreateUnicodeExtraFields(ZipArchiveOutputStream.UnicodeExtraFieldPolicy.ALWAYS)
>  # Remove output.setUseZip64(Zip64Mode.Always) and zip smaller file (e.g. 1Gb)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282155#comment-17282155
 ] 

Peter Lee edited comment on COMPRESS-564 at 2/10/21, 1:42 AM:
--

Please have a look at [this 
PR|https://github.com/apache/commons-compress/pull/165].


was (Author: peterlee):
Please have a look at \{this 
PR|https://github.com/apache/commons-compress/pull/165}.

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282155#comment-17282155
 ] 

Peter Lee commented on COMPRESS-564:


Please have a look at \{this 
PR|https://github.com/apache/commons-compress/pull/165}.

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282144#comment-17282144
 ] 

Peter Lee commented on COMPRESS-564:


> I haven't seen other examples of runtime configuration in commons-compress 
>(although I haven't looked too closely yet)

We do not use runtime configuration the way Spark and Parquet do.

In this case, I think we can make the BufferPool a parameter of the constructor 
of ZstdCompressorInputStream, as sketched below.
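
A hedged sketch of that constructor-based approach (_BufferPool_, 
_RecyclingBufferPool_ and _NoPool_ are real zstd-jni types; the overload itself 
is the proposal, and the field names here are assumptions):
{code:java}
// Proposed overload on ZstdCompressorInputStream (sketch only).
public ZstdCompressorInputStream(final InputStream in, final BufferPool pool)
        throws IOException {
    countingStream = new CountingInputStream(in);      // assumed existing field
    decIS = new ZstdInputStream(countingStream, pool); // zstd-jni draws buffers from the pool
}
{code}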

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282145#comment-17282145
 ] 

Peter Lee commented on COMPRESS-564:


> Note another possible configuration regarding zstd-jni, whether to use 
>implementations with finalizers

And I think we should make this a separate issue. :)

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281662#comment-17281662
 ] 

Peter Lee commented on COMPRESS-564:


And a PR is welcome :)

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-564) Support ZSTD JNI BufferPool

2021-02-09 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281661#comment-17281661
 ] 

Peter Lee commented on COMPRESS-564:


Hi, Michael. This looks good to me.

It seems zstd-jni offers RecyclingBufferPool and NoPool, and it also supports 
custom BufferPool implementations written by users.

I prefer providing only RecyclingBufferPool and NoPool to users, instead of 
custom BufferPools. But I don't like using a boolean the way Spark and Parquet 
did - there may be more BufferPools in zstd-jni in the future, and we can't 
ensure it will always be a config with only 2 choices.

WDYT? Illustrative call sites are sketched below.
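
Illustrative call sites under this design (assuming the constructor overload 
proposed earlier in this thread):
{code:java}
// The caller picks whichever zstd-jni pool it wants; no boolean flag involved.
InputStream pooled   = new ZstdCompressorInputStream(in, RecyclingBufferPool.INSTANCE);
InputStream unpooled = new ZstdCompressorInputStream(in, NoPool.INSTANCE);
{code}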

> Support ZSTD JNI BufferPool
> ---
>
> Key: COMPRESS-564
> URL: https://issues.apache.org/jira/browse/COMPRESS-564
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Michael Heuer
>Priority: Major
>
> commons-compress should allow configuration of zstd-jni's use of 
> RecyclingBufferPool vs NoPool.  zstd-jni defaults to not pool buffers by 
> default
> [https://github.com/luben/zstd-jni/commit/f7c8279bc162c8c8b1964948d0f3b309ad715311]
>  
> Please see pull requests for similar issues in Apache Spark
> [https://github.com/apache/spark/pull/31453]
> and Apache Parquet projects
> [https://github.com/apache/parquet-mr/pull/865]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (COMPRESS-563) Add support for .zst format in ArchiveStreamFactory

2021-01-28 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee closed COMPRESS-563.
--
Resolution: Won't Fix

> Add support for .zst format in ArchiveStreamFactory
> ---
>
> Key: COMPRESS-563
> URL: https://issues.apache.org/jira/browse/COMPRESS-563
> Project: Commons Compress
>  Issue Type: New Feature
>Reporter: Peter Lee
>Priority: Major
>
> Zstandard compressed files (.zst files) have the specific file signature 
> 0xFD2FB528. They should be detected and supported in ArchiveStreamFactory.
>  
> See also:
> [Zstandard frames(RFC8478)|https://tools.ietf.org/html/rfc8478#section-3.1.1]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (COMPRESS-563) Add support for .zst format in ArchiveStreamFactory

2021-01-27 Thread Peter Lee (Jira)
Peter Lee created COMPRESS-563:
--

 Summary: Add support for .zst format in ArchiveStreamFactory
 Key: COMPRESS-563
 URL: https://issues.apache.org/jira/browse/COMPRESS-563
 Project: Commons Compress
  Issue Type: New Feature
Reporter: Peter Lee


Zstandard compressed files (.zst files) have the specific file signature 
0xFD2FB528. They should be detected and supported in ArchiveStreamFactory; a 
sketch of the signature check follows below.

 

See also:

[Zstandard frames (RFC 8478)|https://tools.ietf.org/html/rfc8478#section-3.1.1]
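
For illustration, a minimal sketch of matching the frame magic (a standalone 
helper under the assumption that the caller handles buffering/mark-reset; note 
the magic 0xFD2FB528 appears little-endian on disk, i.e. bytes 0x28 0xB5 0x2F 0xFD):

{code:java}
import java.io.IOException;
import java.io.InputStream;

public class ZstdSignature {

    // 0xFD2FB528 stored little-endian at the start of a Zstandard frame.
    private static final byte[] MAGIC = {
        (byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD
    };

    public static boolean matches(final InputStream in) throws IOException {
        final byte[] head = new byte[MAGIC.length];
        // Simplification: a single read; real detection code would loop
        // until MAGIC.length bytes are available or EOF is hit.
        final int read = in.read(head);
        if (read < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
{code}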



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266536#comment-17266536
 ] 

Peter Lee edited comment on COMPRESS-562 at 1/16/21, 9:47 AM:
--

Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]
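
Below is a condensed, hypothetical sketch of the behavioural difference 
described above - not the actual OpenJDK or Commons Compress code, just the two 
signature-handling policies side by side:

{code:java}
import java.io.IOException;

// Hypothetical, heavily condensed illustration of the two policies.
public class SignaturePolicy {

    // Local File Header signature "PK\003\004".
    static final int LFH = 0x04034b50;

    // ZipInputStream-style: anything that is not a Local File Header
    // is silently treated as the end of the archive (returns null).
    static Integer lenientNextEntry(final int signature) {
        return signature == LFH ? Integer.valueOf(signature) : null;
    }

    // ZipArchiveInputStream-style: an unrecognised signature is an error.
    static int strictNextEntry(final int signature) throws IOException {
        if (signature != LFH) {
            throw new IOException("Unexpected record signature: 0X"
                    + Integer.toHexString(signature));
        }
        return signature;
    }

    public static void main(final String[] args) throws IOException {
        System.out.println(lenientNextEntry(0x0)); // null: quiet end of stream
        strictNextEntry(0x0);                      // throws IOException
    }
}
{code}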


was (Author: peterlee):
Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG, test-services-1.1.0.apk
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T

[jira] [Comment Edited] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266536#comment-17266536
 ] 

Peter Lee edited comment on COMPRESS-562 at 1/16/21, 9:46 AM:
--

Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]


was (Author: peterlee):
Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG, test-services-1.1.0.apk
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -

[jira] [Comment Edited] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266536#comment-17266536
 ] 

Peter Lee edited comment on COMPRESS-562 at 1/16/21, 9:45 AM:
--

Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]


was (Author: peterlee):
Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG, test-services-1.1.0.apk
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T sa

[jira] [Commented] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266538#comment-17266538
 ] 

Peter Lee commented on COMPRESS-562:


The attached test-services-1.1.0.apk I uploaded is the one I mentioned - I 
removed the redundant zero bytes, and it can be successfully read with 
ZipArchiveInputStream.

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG, test-services-1.1.0.apk
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-562:
---
Attachment: test-services-1.1.0.apk

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG, test-services-1.1.0.apk
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266536#comment-17266536
 ] 

Peter Lee commented on COMPRESS-562:


Disclaimer: not familiar with apk

I checked the apk file (test-services-1.1.0.apk) and found something strange:

There are 237 bytes of zero before the actual apk signing block.

!apk.PNG!

These redundant zero bytes break the read of the apk signing block - that's why 
we are throwing the unexpected record signature exception. And I can 
successfully read this apk file once these bytes are removed.

According to the [apk signing block 
specification|https://source.android.com/security/apksigning/v2], these bytes 
are not mentioned. Please feel free to tell me if they are legitimate.

 

In short, I believe the apk file is corrupted and cannot be successfully read 
using ZipArchiveInputStream (but it can be read with ZipFile).

 

BTW: Why can the Java standard zip (ZipInputStream) successfully read this apk?

I checked the code of ZipInputStream and found that it doesn't check whether a 
Central Directory File header or an APK signing block has been reached. It 
simply returns null if the signature is not that of a Local File Header. That's 
why it doesn't report any exception.

See also: [ZipInputStream in 
OpenJDK|https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipInputStream.java#L284]

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee updated COMPRESS-562:
---
Attachment: apk.PNG

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
> Attachments: apk.PNG
>
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266534#comment-17266534
 ] 

Peter Lee commented on COMPRESS-562:


I tried reading this .apk with ZipFile and it can be read successfully - maybe 
you can try ZipFile instead of ZipArchiveInputStream.
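
For reference, a minimal sketch of listing entries with Commons Compress's 
ZipFile, which seeks to the central directory instead of streaming (assuming 
the apk is available as a local file):

{code:java}
import java.io.File;
import java.io.InputStream;
import java.util.Enumeration;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;

public class ListApkEntries {
    public static void main(final String[] args) throws Exception {
        try (ZipFile zipFile = new ZipFile(new File("test-services-1.1.0.apk"))) {
            final Enumeration<ZipArchiveEntry> entries = zipFile.getEntries();
            while (entries.hasMoreElements()) {
                final ZipArchiveEntry entry = entries.nextElement();
                System.out.println("entry: " + entry.getName());
                try (InputStream is = zipFile.getInputStream(entry)) {
                    // consume entry content if needed
                }
            }
        }
    }
}
{code}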

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-562) ZipArchiveInputStream fails with unexpected record signature while ZipInputStream from java.util.zip succeeds

2021-01-16 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266530#comment-17266530
 ] 

Peter Lee commented on COMPRESS-562:


This is associated with COMPRESS-455 and COMPRESS-461.

I'm trying to track down the problem. Considering I'm not familiar with the APK 
specification, a patch or PR is always welcome. :)

> ZipArchiveInputStream fails with unexpected record signature while 
> ZipInputStream from java.util.zip succeeds
> -
>
> Key: COMPRESS-562
> URL: https://issues.apache.org/jira/browse/COMPRESS-562
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.20
> Environment: Zip 3.0 (July 5th 2008), by Info-ZIP, Compiled with gcc 
> 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) 
> on Feb 22 2019.
> osx 10.14.6, AdoptOpenJDK 11.0.7
>Reporter: Oleksii Khomchenko
>Priority: Major
>
> Thank you very much for the library.
>  
> I recently encountered the following issue:
> {code:java}
> Exception in thread "main" java.util.zip.ZipException: Unexpected record 
> signature: 0X0
> {code}
> is thrown when reading test-services-1.1.0.apk from 
> [https://maven.google.com/web/index.html?q=test-ser#androidx.test.services:test-services:1.1.0]
>  via commons-compress:1.20 while java.util.zip reads it without the exception.
>  
> {code:java}
> public class UnzipTestServicesSample {
> public static void main(String[] args) throws Exception {
> Path p = Paths.get("test-services-1.1.0.apk");
> System.out.println("\n=== java std zip ===\n");
> try (InputStream is = Files.newInputStream(p); ZipInputStream zis = 
> new ZipInputStream(is)) {
> ZipEntry entry;
> while ((entry = zis.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> System.out.println("\n=== apache compress zip ===\n");
> try (InputStream is = Files.newInputStream(p); ArchiveInputStream ais 
> = new ZipArchiveInputStream(is)) {
> ArchiveEntry entry;
> while ((entry = ais.getNextEntry()) != null) {
> System.out.println("entry: " + entry.getName());
> }
> }
> }
> }{code}
>  
> zip -T says that archive is fine:
>  
> {code:java}
> $ zip -T test-services-1.1.0.apk 
> test of test-services-1.1.0.apk OK{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (COMPRESS-503) "open when actually needed" for MultiReadOnlySeekableByteChannel

2021-01-02 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee closed COMPRESS-503.
--
Resolution: Duplicate

> "open when actually needed" for MultiReadOnlySeekableByteChannel
> 
>
> Key: COMPRESS-503
> URL: https://issues.apache.org/jira/browse/COMPRESS-503
> Project: Commons Compress
>  Issue Type: Improvement
>Affects Versions: 1.19, 1.20
>Reporter: Peter Alfred Lee
>Priority: Minor
> Fix For: 1.21
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When I was adding zip64 support for split zip, I encountered a problem:
> When adding testcases in {{Zip64SupportIT}}, I created a split zip with 
> 10,000+ split segments. Then I found that I was unable to unzip it because 
> there would be too many open files when extracting it. We can open the files 
> only when actually needed, and therefore successfully extract split zips with 
> a great number of segments.
> I have set a threshold of 20 in {{MultiReadOnlySeekableByteChannel}}. The 
> "open when actually needed" procedure only kicks in when the number of split 
> segments is greater than the threshold.
>  
> This may be used in ZipArchiveInputStream and ZipFile because 
> {{MultiReadOnlySeekableByteChannel}} is used in both.
>  
> Actually this is a pretty rare case because most split zips would not have too 
> many segments. Just think about a split zip with 1,000+ segments - it must 
> be a nightmare. So I'm not sure if this is needed for 
> {{MultiReadOnlySeekableByteChannel}}. WDYT?
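
As an aside, a minimal sketch of the "open when actually needed" idea (a 
hypothetical simplification, not the actual MultiReadOnlySeekableByteChannel 
code; a fuller version would also cap the number of simultaneously open 
channels at the threshold mentioned above):

{code:java}
import java.io.IOException;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simplification: a split segment's channel is opened on
// first access instead of up front, so a zip with 10,000+ segments
// does not exhaust open file handles.
public class LazySegments {

    private final List<Path> segmentPaths;
    private final Map<Integer, SeekableByteChannel> openChannels = new HashMap<>();

    public LazySegments(final List<Path> segmentPaths) {
        this.segmentPaths = segmentPaths;
    }

    public SeekableByteChannel channelFor(final int index) throws IOException {
        SeekableByteChannel channel = openChannels.get(index);
        if (channel == null) {
            // Opened only when this segment is actually read.
            channel = Files.newByteChannel(segmentPaths.get(index));
            openChannels.put(index, channel);
        }
        return channel;
    }

    public void close() throws IOException {
        for (final SeekableByteChannel channel : openChannels.values()) {
            channel.close();
        }
        openChannels.clear();
    }
}
{code}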



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-559) SparseTarTest different on Linux and Windows

2020-12-26 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-559.

Fix Version/s: 1.21
   Resolution: Fixed

> SparseTarTest different on Linux and Windows
> 
>
> Key: COMPRESS-559
> URL: https://issues.apache.org/jira/browse/COMPRESS-559
> Project: Commons Compress
>  Issue Type: Bug
>Reporter: Robin Schimpf
>Priority: Major
> Fix For: 1.21
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The {{SparseFilesTest#testExtractPaxGNU}} test currently ignores the content file 
> {{sparsefile-0.1}} on Linux only, due to some problems on Ubuntu 16.04.
> On Windows the test currently extracts this content file.
> The file should also be extracted on Linux, since the problem is no longer 
> visible on newer Ubuntu versions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-561) Minor improvement

2020-12-26 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-561.

Fix Version/s: 1.21
   Resolution: Fixed

> Minor improvement
> -
>
> Key: COMPRESS-561
> URL: https://issues.apache.org/jira/browse/COMPRESS-561
> Project: Commons Compress
>  Issue Type: Improvement
>Reporter: Arturo Bernal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.21
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Minor improvements:
>  
>  * Remove an unused import
>  * Add final where possible
>  * Replace Charset.forName("UTF-8") with StandardCharsets.UTF_8 (see the 
> sketch below)
>  * Remove an unnecessary semicolon
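
To illustrate the charset item above, a minimal sketch (illustrative only, not 
the project's code):

{code:java}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration of the charset cleanup listed above.
public class CharsetExample {

    // Before: looks the charset up by name at runtime and can throw
    // java.nio.charset.UnsupportedCharsetException.
    static final Charset OLD = Charset.forName("UTF-8");

    // After: a compile-time constant - no lookup, no failure path.
    static final Charset NEW = StandardCharsets.UTF_8;
}
{code}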



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-560) TarArchiveInputStream returns wrong value on canReadEntryData for sparse entries

2020-12-09 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-560.

Fix Version/s: 1.21
   Resolution: Fixed

> TarArchiveInputStream returns wrong value on canReadEntryData for sparse 
> entries
> 
>
> Key: COMPRESS-560
> URL: https://issues.apache.org/jira/browse/COMPRESS-560
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
> Fix For: 1.21
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Since sparse tar content files are now supported, the result of 
> canReadEntryData is wrong: it still returns false if the entry is a 
> sparse file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-560) TarArchiveInputStream returns wrong value on canReadEntryData for sparse entries

2020-11-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239003#comment-17239003
 ] 

Peter Lee edited comment on COMPRESS-560 at 11/26/20, 9:15 AM:
---

Yes. Seems we can always read the entry now. -And we do not need a 
canReadEntryData here any more.-

-Since it's a public method and removing it may break binary compatibility, 
maybe we can make it always return true and add a Deprecated annotation here. 
WDYT?-

 

Update: I forgot about the classes ArchiveEntry and TarArchiveEntry. We only 
need to check whether the entry is an instance of TarArchiveEntry here.
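
A sketch of the check suggested above (an assumption based on this discussion, 
not necessarily the committed fix):

{code:java}
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;

// Hypothetical helper mirroring the suggestion above: since sparse
// entries are now readable, the only remaining requirement is that
// the entry actually is a TarArchiveEntry.
public final class CanRead {
    public static boolean canReadEntryData(final ArchiveEntry ae) {
        return ae instanceof TarArchiveEntry;
    }
}
{code}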


was (Author: peterlee):
Yes. Seems we can always read the entry now, and we do not need a 
canReadEntryData here any more.

-Since it's a public method and removing it may break the BC, maybe we can make 
it always return true and have a Deprecated annotation here. WDYT?-

> TarArchiveInputStream returns wrong value on canReadEntryData for sparse 
> entries
> 
>
> Key: COMPRESS-560
> URL: https://issues.apache.org/jira/browse/COMPRESS-560
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since sparse tar content files are now supported, the result of 
> canReadEntryData is wrong: it still returns false if the entry is a 
> sparse file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (COMPRESS-560) TarArchiveInputStream returns wrong value on canReadEntryData for sparse entries

2020-11-26 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239003#comment-17239003
 ] 

Peter Lee edited comment on COMPRESS-560 at 11/26/20, 9:12 AM:
---

Yes. Seems we can always read the entry now, and we do not need a 
canReadEntryData here any more.

-Since it's a public method and removing it may break binary compatibility, 
maybe we can make it always return true and add a Deprecated annotation here. WDYT?-


was (Author: peterlee):
Yes. Seems we can always read the entry now, and we do not need a 
canReadEntryData here any more.

Since it's a public method and removing it may break the BC, maybe we can make 
it always return true and have a Deprecated annotation here. WDYT?

> TarArchiveInputStream returns wrong value on canReadEntryData for sparse 
> entries
> 
>
> Key: COMPRESS-560
> URL: https://issues.apache.org/jira/browse/COMPRESS-560
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since sparse tar content files are now supported, the result of 
> canReadEntryData is wrong: it still returns false if the entry is a 
> sparse file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-560) TarArchiveInputStream returns wrong value on canReadEntryData for sparse entries

2020-11-25 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239003#comment-17239003
 ] 

Peter Lee commented on COMPRESS-560:


Yes. Seems we can always read the entry now, and we do not need a 
canReadEntryData here any more.

Since it's a public method and removing it may break binary compatibility, 
maybe we can make it always return true and add a Deprecated annotation here. WDYT?

> TarArchiveInputStream returns wrong value on canReadEntryData for sparse 
> entries
> 
>
> Key: COMPRESS-560
> URL: https://issues.apache.org/jira/browse/COMPRESS-560
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.20
>Reporter: Robin Schimpf
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since sparse tar content files are now supported, the result of 
> canReadEntryData is wrong: it still returns false if the entry is a 
> sparse file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-509) The ambiguous behavior of the TarArchiveEntry.getName() method

2020-11-20 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-509.

Resolution: Fixed

> The ambiguous behavior of the TarArchiveEntry.getName() method
> --
>
> Key: COMPRESS-509
> URL: https://issues.apache.org/jira/browse/COMPRESS-509
> Project: Commons Compress
>  Issue Type: Bug
>  Components: Archivers
>Affects Versions: 1.18, 1.20
>Reporter: Petr Vasak
>Assignee: Peter Lee
>Priority: Minor
> Fix For: 1.21
>
> Attachments: Main.java
>
>
> Scenario: tar an empty directory and then untar it. When the name is 
> longer than 100 characters, no trailing slash appears.
> Example: see attachment
> Part of the output:
> ..
> dir/aa/
> dir/aaa/
> dir/
> dir/a
> ..



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (COMPRESS-558) Current master fails to extract ActiveMQ tar archive

2020-11-15 Thread Peter Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/COMPRESS-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Lee resolved COMPRESS-558.

Fix Version/s: 1.21
   Resolution: Fixed

> Current master fails to extract ActiveMQ tar archive
> 
>
> Key: COMPRESS-558
> URL: https://issues.apache.org/jira/browse/COMPRESS-558
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Robin Schimpf
>Priority: Major
> Fix For: 1.21
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> While version 1.20 is able to extract the ActiveMQ tar archive, which can be 
> found here ([https://activemq.apache.org/components/classic/download/]), the 
> current master fails to extract it with the following exception
> {code:java}
> java.io.IOException: Error detected parsing the header
>   at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:385)
> {code}
> A bisect search reveals that the error was introduced with COMPRESS-509. My 
> analysis found the following reasons why the problem appears
>  * The error is thrown after reading a file in the tar, but the file name has 
> a "/" at the end
>  * In currEntry.isGNULongNameEntry, currEntry.isDirectory evaluates to 
> true for this file -> the reason why the "/" is appended
>  * The underlying problem is that isDirectory also checks whether the entry 
> name ends with "/", but the decoded file name is not yet set in currEntry. 
> Since the entry before the file is a folder, the name ends with "/" and 
> isDirectory returns the wrong value
> Since I already came this far, I'll send a pull request where the decoded 
> name is set before checking whether we need to append "/" to the entry name. 
> With this change the ActiveMQ tar can be extracted correctly again.
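
To make the ordering issue concrete, a self-contained illustration 
(hypothetical; not the actual TarArchiveInputStream code):

{code:java}
// Hypothetical illustration of the ordering bug described above:
// isDirectory() consults the name field, so it must only be called
// after the decoded name of the *current* entry has been assigned.
public class EntryNameOrdering {

    private String name = "";

    // Simplified stand-in for TarArchiveEntry#isDirectory, which (among
    // other checks) looks for a trailing "/" on the entry name.
    boolean isDirectory() {
        return name.endsWith("/");
    }

    void setName(final String decodedName) {
        this.name = decodedName;
    }

    public static void main(final String[] args) {
        final EntryNameOrdering entry = new EntryNameOrdering();
        entry.setName("dir/"); // previous entry: a directory
        // Bug pattern: consulting isDirectory() for the next entry before
        // its decoded name is assigned reuses the stale "dir/" name:
        System.out.println(entry.isDirectory()); // true, but wrong for the next entry
        // Fix pattern: assign the decoded name first, then test:
        entry.setName("dir/file");
        System.out.println(entry.isDirectory()); // false, as expected
    }
}
{code}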



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (COMPRESS-558) Current master fails to extract ActiveMQ tar archive

2020-11-13 Thread Peter Lee (Jira)


[ 
https://issues.apache.org/jira/browse/COMPRESS-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231450#comment-17231450
 ] 

Peter Lee commented on COMPRESS-558:


Hi [~rschimpf]. Great catch!

This problem was caused by the incomplete fix for COMPRESS-509, and this fix 
looks good to me.

Thank you for your contribution. :)

> Current master fails to extract ActiveMQ tar archive
> 
>
> Key: COMPRESS-558
> URL: https://issues.apache.org/jira/browse/COMPRESS-558
> Project: Commons Compress
>  Issue Type: Bug
>Affects Versions: 1.21
>Reporter: Robin Schimpf
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While version 1.20 is able to extract the ActiveMQ tar archive, which can be 
> found here ([https://activemq.apache.org/components/classic/download/]), the 
> current master fails to extract it with the following exception
> {code:java}
> java.io.IOException: Error detected parsing the header
>   at 
> org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:385)
> {code}
> A bisect search reveals that the error was introduced with COMPRESS-509. My 
> analysis found the following reasons why the problem appears
>  * The error is thrown after reading a file in the tar, but the file name has 
> a "/" at the end
>  * In currEntry.isGNULongNameEntry, currEntry.isDirectory evaluates to 
> true for this file -> the reason why the "/" is appended
>  * The underlying problem is that isDirectory also checks whether the entry 
> name ends with "/", but the decoded file name is not yet set in currEntry. 
> Since the entry before the file is a folder, the name ends with "/" and 
> isDirectory returns the wrong value
> Since I already came this far, I'll send a pull request where the decoded 
> name is set before checking whether we need to append "/" to the entry name. 
> With this change the ActiveMQ tar can be extracted correctly again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

