Hi Claes,

will you open a bug for this?

Thanks
Christoph

> -----Original Message-----
> From: core-libs-dev <core-libs-dev-boun...@openjdk.java.net> On Behalf
> Of Lennart Börjeson
> Sent: Dienstag, 16. April 2019 09:05
> To: Claes Redestad <claes.redes...@oracle.com>
> Cc: core-libs-dev@openjdk.java.net
> Subject: Re: ZipFileSystem performance regression
> 
> Very good, thank you!
> 
> Also note that the "new" implementation also requires *a lot* more heap,
> since all *uncompressed* file content is copied to the heap before deflating.
> 
> Best regards,
> 
> /Lennart Börjeson
> 
> > 15 apr. 2019 kl. 18:34 skrev Claes Redestad <claes.redes...@oracle.com>:
> >
> > Hi Lennart,
> >
> > I can reproduce this locally, and have narrowed down to
> > https://bugs.openjdk.java.net/browse/JDK-8034802 as the cause.
> >
> > As you say the compression is deferred to ZipFileSystem.close() now,
> > whereas previously it happened eagerly. We will have to analyze the
> > changes more in-depth to try and see why this is the case.
> >
> > Thanks!
> >
> > /Claes
> >
> > On 2019-04-15 15:32, Lennart Börjeson wrote:
> >> I have made a small command-line utility which creates zip archives by
> compressing the input files in parallel.
> >> I do this by calling Files.copy(input, zipOutputStream) in a parallel 
> >> Stream
> over all input files.
> >> I have run this with Java 1.8, 9, 10, and 11, on both my local laptop and 
> >> on
> server-class machines with up to 40 cores.
> >> It scales rather well and I get the expected speedup, i.e. roughly
> proportional to the number of cores.
> >> With OpenJDK 12, however, I get no speedup whatsoever. My
> understanding is that in JDK 12, the copy method simply
> >> transfers the contents of the input stream to a ByteArrayOutputStream,
> which is then deflated when I close the ZipFileSystem (by the
> ZipFileSystem.sync() method).
> >> Previously, the deflation was done when in the call to Files.copy, thus
> executed in parallel, and the final ZipFileSystem.close() didn't do anything
> much.
> >> (But I may of course be wrong. In case there's a simpler explanation and
> something I can remedy in my code, please let me know.)
> >> I have a small GitHub gist for the utility here:
> https://gist.github.com/lenborje/6d2f92430abe4ba881e3c5ff83736923
> <https://gist.github.com/lenborje/6d2f92430abe4ba881e3c5ff83736923>
> >> Steps to reproduce (timings on my 8-core MacBook Pro):
> >> 1) Get the contents of the gist as a single file, Zip.java
> >> 2) Compile it using Java 8.
> >> $ export JAVA_HOME=<PATH-TO-JDK8-HOME>
> >> $ javac -encoding utf8 Zip.java
> >> 3) run on a sufficiently large number of files to exercise the parallelity:
> (I've used about 70 text files ca 60MB each)
> >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log.
> >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options
> [PARALLEL]
> >> ...
> >> completed zip archive, now closing... done!
> >> real       0m35.558s
> >> user       3m58.134s
> >> sys        0m5.543s
> >> As is evident from the ratio between "user time" and "real time", all cores
> have been busy most of the time.
> >> (Running with JDK 9, 10, and 11 produces similar timings.)
> >> But running with JDK 12 defeats the parallelism:
> >> $ export JAVA_HOME=<PATH-TO-JDK12-HOME>
> >> $ rm /tmp/test.zip # From previous run
> >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log.
> >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options
> [PARALLEL]
> >> ...
> >> completed zip archive, now closing... done!
> >> real       3m1.187s
> >> user       3m5.422s
> >> sys        0m12.396s
> >> Now there's almost no speedup. When observing the output, note that
> the ZipFileSystem.close() method is called immediately after the "now
> closing..." output, and "Done!" Is written when it returns, and when running
> with JDK 12 almost all running time is apparently spent there.
> >> I'm hoping the previous behaviour could somehow be restored, i.e. that
> deflation actually happens when I'm copying the input files to the
> ZipFileSystem, and not when I close it.
> >> Best regards,
> >> /Lennart Börjeson
> >>> 12 apr. 2019 kl. 14:25 skrev Lennart Börjeson <lenbo...@gmail.com>:
> >>>
> >>> I've found what I believe is a rather severe performance regression in
> ZipFileSystem. 1.8 and 11 runs OK, 12 does not.
> >>>
> >>> Is this the right forum to report such issues?
> >>>
> >>> Best regards,
> >>>
> >>> /Lennart Börjeson

Reply via email to