Hi Claes, will you open a bug for this?
Thanks,
Christoph

> -----Original Message-----
> From: core-libs-dev <core-libs-dev-boun...@openjdk.java.net> On Behalf
> Of Lennart Börjeson
> Sent: Tuesday, 16 April 2019 09:05
> To: Claes Redestad <claes.redes...@oracle.com>
> Cc: core-libs-dev@openjdk.java.net
> Subject: Re: ZipFileSystem performance regression
>
> Very good, thank you!
>
> Also note that the "new" implementation requires *a lot* more heap,
> since all *uncompressed* file content is copied to the heap before deflating.
>
> Best regards,
>
> /Lennart Börjeson
>
> > On 15 Apr 2019, at 18:34, Claes Redestad <claes.redes...@oracle.com> wrote:
> >
> > Hi Lennart,
> >
> > I can reproduce this locally, and have narrowed it down to
> > https://bugs.openjdk.java.net/browse/JDK-8034802 as the cause.
> >
> > As you say the compression is deferred to ZipFileSystem.close() now,
> > whereas previously it happened eagerly. We will have to analyze the
> > changes more in-depth to try and see why this is the case.
> >
> > Thanks!
> >
> > /Claes
> >
> > On 2019-04-15 15:32, Lennart Börjeson wrote:
> >> I have made a small command-line utility which creates zip archives by
> >> compressing the input files in parallel.
> >> I do this by calling Files.copy(input, zipOutputStream) in a parallel
> >> Stream over all input files.
> >> I have run this with Java 1.8, 9, 10, and 11, on both my local laptop and
> >> on server-class machines with up to 40 cores.
> >> It scales rather well and I get the expected speedup, i.e. roughly
> >> proportional to the number of cores.
> >> With OpenJDK 12, however, I get no speedup whatsoever. My understanding
> >> is that in JDK 12, the copy method simply transfers the contents of the
> >> input stream to a ByteArrayOutputStream, which is then deflated when I
> >> close the ZipFileSystem (by the ZipFileSystem.sync() method).
> >> Previously, the deflation was done in the call to Files.copy, thus
> >> executed in parallel, and the final ZipFileSystem.close() didn't do
> >> anything much.
> >> (But I may of course be wrong. In case there's a simpler explanation and
> >> something I can remedy in my code, please let me know.)
> >> I have a small GitHub gist for the utility here:
> >> https://gist.github.com/lenborje/6d2f92430abe4ba881e3c5ff83736923
> >> Steps to reproduce (timings on my 8-core MacBook Pro):
> >> 1) Get the contents of the gist as a single file, Zip.java
> >> 2) Compile it using Java 8.
> >> $ export JAVA_HOME=<PATH-TO-JDK8-HOME>
> >> $ javac -encoding utf8 Zip.java
> >> 3) Run on a sufficiently large number of files to exercise the
> >> parallelism (I've used about 70 text files, ca 60MB each):
> >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log.
> >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options
> >> [PARALLEL]
> >> ...
> >> completed zip archive, now closing... done!
> >> real 0m35.558s
> >> user 3m58.134s
> >> sys 0m5.543s
> >> As is evident from the ratio between "user time" and "real time", all
> >> cores have been busy most of the time.
> >> (Running with JDK 9, 10, and 11 produces similar timings.)
> >> But running with JDK 12 defeats the parallelism:
> >> $ export JAVA_HOME=<PATH-TO-JDK12-HOME>
> >> $ rm /tmp/test.zip # From previous run
> >> $ time java -Xmx6g Zip -p /tmp/test.zip <WHATEVER>/*.log.
> >> Working on ZIP FileSystem jar:file:/tmp/test.zip, using the options
> >> [PARALLEL]
> >> ...
> >> completed zip archive, now closing... done!
> >> real 3m1.187s
> >> user 3m5.422s
> >> sys 0m12.396s
> >> Now there's almost no speedup. When observing the output, note that
> >> the ZipFileSystem.close() method is called immediately after the "now
> >> closing..." output, and "done!" is written when it returns. When running
> >> with JDK 12, almost all running time is apparently spent there.
> >> I'm hoping the previous behaviour could somehow be restored, i.e.
> >> that deflation actually happens when I'm copying the input files to the
> >> ZipFileSystem, and not when I close it.
> >> Best regards,
> >> /Lennart Börjeson
> >>> On 12 Apr 2019, at 14:25, Lennart Börjeson <lenbo...@gmail.com> wrote:
> >>>
> >>> I've found what I believe is a rather severe performance regression in
> >>> ZipFileSystem. 1.8 and 11 run OK, 12 does not.
> >>>
> >>> Is this the right forum to report such issues?
> >>>
> >>> Best regards,
> >>>
> >>> /Lennart Börjeson
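
For readers who cannot open the gist: the approach described in the thread (copying input files into a zip FileSystem from a parallel stream) might look roughly like the sketch below. This is not the code from the gist. The class name ParallelZipSketch and the method zip are hypothetical, and the sketch uses Files.copy(Path, Path) against a zip FileSystem opened with the "create" property, which is an assumption about how the utility is wired up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the Zip.java from the gist.
final class ParallelZipSketch {

    // Copies every input file into a new zip archive, one entry per file,
    // using a parallel stream so that the copies may run concurrently.
    // Assumes the archive does not exist yet and that input file names are unique.
    static void zip(Path archive, List<Path> inputs) throws IOException {
        URI uri = URI.create("jar:" + archive.toUri());
        try (FileSystem zipfs = FileSystems.newFileSystem(
                uri, Collections.singletonMap("create", "true"))) {
            inputs.parallelStream().forEach(input -> {
                Path target = zipfs.getPath("/" + input.getFileName());
                try {
                    // Per the thread: on JDK 8 to 11 the deflation happens here,
                    // on JDK 12 it is deferred until the file system is closed.
                    Files.copy(input, target);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } // close() writes the archive; on JDK 12 this is where the time goes
    }
}
```

Timing a call such as ParallelZipSketch.zip(Paths.get("/tmp/test.zip"), inputs) separately for the copy loop and for the closing of the file system should show where the time is spent on a given JDK.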