[ 
https://issues.apache.org/jira/browse/COMPRESS-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633774#comment-16633774
 ] 

Jakob Sultan Ericsson commented on COMPRESS-466:
------------------------------------------------

I made a change that supports reading only the central directory.

I haven't added support for multiple entries with the same name, or any
Unicode support in comments.

https://github.com/jakeri/commons-compress/tree/COMPRESS-466

Opening my 35gb.zip went from 5-6 minutes to 17-18 seconds. The time is now
spent building the central directory information.

Pure speculation, but maybe this time could be decreased even further if you
read the central directory into memory once (sacrificing memory for speed) and
then built the directory information by reading from a large ByteBuffer.
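
To make the speculation above concrete, here is a minimal sketch of the idea.
It is not the code in my branch and not how Commons Compress does it. The
class name CentralDirectoryScan and the method readEntryNames are hypothetical.
It assumes a plain (non-Zip64) archive, a central directory smaller than 2 GB,
and UTF-8 entry names, so a file as large as 35gb.zip would additionally need
the Zip64 end-of-central-directory records handled.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads the whole central directory into one ByteBuffer
// and parses entry names from memory instead of seeking per entry.
final class CentralDirectoryScan {
    private static final int EOCD_SIG = 0x06054b50; // end of central directory record
    private static final int CFH_SIG = 0x02014b50;  // central file header
    private static final int EOCD_MIN = 22;         // EOCD length without archive comment
    private static final int CFH_FIXED = 46;        // fixed-length part of a central file header

    static List<String> readEntryNames(Path zip) throws IOException {
        try (FileChannel ch = FileChannel.open(zip, StandardOpenOption.READ)) {
            // The EOCD sits at the end of the file, followed by at most 65535 comment bytes.
            long size = ch.size();
            int tailLen = (int) Math.min(size, EOCD_MIN + 0xFFFF);
            ByteBuffer tail = readFully(ch, size - tailLen, tailLen);
            int eocd = -1;
            for (int i = tailLen - EOCD_MIN; i >= 0; i--) {
                if (tail.getInt(i) == EOCD_SIG) {
                    eocd = i;
                    break;
                }
            }
            if (eocd < 0) {
                throw new IOException("no end of central directory record found");
            }
            long cdSize = tail.getInt(eocd + 12) & 0xFFFFFFFFL;
            long cdOffset = tail.getInt(eocd + 16) & 0xFFFFFFFFL;
            // Assumption: Zip64 archives (marked by 0xFFFFFFFF) are out of scope here.
            if (cdSize == 0xFFFFFFFFL || cdOffset == 0xFFFFFFFFL || cdSize > Integer.MAX_VALUE) {
                throw new IOException("Zip64 or oversized central directory not handled in this sketch");
            }

            // One bulk read of the central directory; everything below is in-memory parsing.
            ByteBuffer cd = readFully(ch, cdOffset, (int) cdSize);
            List<String> names = new ArrayList<>();
            while (cd.remaining() >= CFH_FIXED && cd.getInt(cd.position()) == CFH_SIG) {
                int p = cd.position();
                int nameLen = cd.getShort(p + 28) & 0xFFFF;
                int extraLen = cd.getShort(p + 30) & 0xFFFF;
                int commentLen = cd.getShort(p + 32) & 0xFFFF;
                byte[] name = new byte[nameLen];
                cd.position(p + CFH_FIXED);
                cd.get(name);
                names.add(new String(name, StandardCharsets.UTF_8));
                // Skip the extra field and comment to reach the next header.
                cd.position(p + CFH_FIXED + nameLen + extraLen + commentLen);
            }
            return names;
        }
    }

    // Reads exactly len bytes starting at pos into a little-endian buffer.
    private static ByteBuffer readFully(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos + buf.position());
            if (n < 0) {
                throw new IOException("unexpected end of file");
            }
        }
        buf.flip();
        return buf;
    }
}
```

The trade-off is the one mentioned above: the whole central directory is held
in memory at once, in exchange for two bulk reads instead of many small ones.
A malformed archive would surface as a buffer exception here; real code would
need proper validation.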

> Opening of a very large zip file is extremely slow compared to 
> java.util.zip.ZipFile
> ------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-466
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-466
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.18
>         Environment: Tested both on Linux and OSX 10.13.6.
>            Reporter: Jakob Sultan Ericsson
>            Priority: Major
>
> We have a quite large zip file (35 GB) and try to open it with ZipFile.
> {code:java}
>         long start = System.currentTimeMillis();
>         try (ZipFile zf = new ZipFile(new File("35gb.zip"))) {
>             // Report how long it took to open the archive.
>             System.out.println("File opened..." + (System.currentTimeMillis() - start));
>         }
> {code}
> This code takes about 300 000 - 400 000 ms (5-6 minutes).
> If I run this with the JDK's built-in java.util.zip.ZipFile, the same code
> takes 300 ms (less than a second).
> I'm not totally sure what the problem is, but I did some debugging and
> basically all the time is spent in
> {code:java}
>     private void resolveLocalFileHeaderData(final Map<ZipArchiveEntry, NameAndComment> entriesWithoutUTF8Flag)
> {code}
> Anything that can be done to improve this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
