Speeding up jmod

2016-04-30 Thread Claes Redestad

Hi,

Alan asked me to take a look at jmod performance (also jlink, but saving 
that for another day), so I set

up a naive benchmark[1] and started profiling.

... and saw nothing really suspicious except that time is split between 
doing I/O and executing native code in
libz.so, which I guess isn't surprising. Oddly enough the only java 
methods that even show up in
profiles are related to writing, so I figured taking a closer look at 
the code for writing output from jmod
wouldn't hurt. Turns out I was wrong, since I soon found that the output 
stream used by JmodTask is

unbuffered...

Applied a trivial patch[2] and results of running the micro with -f 10 
-i 1 -bm ss (which is more or less like

running jmod standalone):

Benchmark   Mode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBasess   10  1.966 ± 0.297   s/op # before
JmodBenchmark.jmodJavaBasess   10  1.196 ± 0.142   s/op # after

Seems like a notable reduction right there. Timing runs of jmod 
standalone gives analogous results on

real time, but user time is still almost as high.

Poking around further and it's obvious JIT threads are eating a larger 
portion of my cycles now - likely C2 is
ramping up but not having time to get much done in the short life-time 
of jmod, which is mostly spent in
native code anyhow. Switching to running short-running apps with only C1 
can be profitable, especially on
machines with a lot of cores (like the 2x8x2 machine I'm running this 
on), so I ran the numbers:


Again, with time:

Benchmark   Mode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBasess   10  1.175 ± 0.147   s/op

real0m17.140s
user0m54.868s
sys0m4.172s

-XX:TieredStopAtLevel=1

BenchmarkMode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBase  thrpt   10  1.075 ± 0.194  ops/s

real0m14.810s
user0m15.556s
sys0m1.584s

Yep, only running "C1" improves things a lot in this case and on my 
environment.


I suggest accepting the patch[2] as well as switching the jmod runner to 
run with -XX:TieredStopAtLevel=1
or similar. Both are likely needed for most to see any effect on build 
times.


A long term alternative to consider might be to implement a server-based 
jmod akin to the javac server.


Thanks!

/Claes

[1] http://cr.openjdk.java.net/~redestad/scratch/JmodBenchmark.java - 
assumes the java.base directory
of module compiled sources are available in the working directory, which 
I won't include here for obvious

reasons. -XaddExports:jdk.jlink/jdk.tools.jmod=ALL-UNNAMED

[2] http://cr.openjdk.java.net/~redestad/scratch/jmod_buff.01/



Re: Speeding up jmod

2016-05-01 Thread Alan Bateman

On 01/05/2016 03:02, Claes Redestad wrote:

Hi,

Alan asked me to take a look at jmod performance (also jlink, but 
saving that for another day), so I set

up a naive benchmark[1] and started profiling.

... and saw nothing really suspicious except that time is split 
between doing I/O and executing native code in
libz.so, which I guess isn't surprising. Oddly enough the only java 
methods that even show up in
profiles are related to writing, so I figured taking a closer look at 
the code for writing output from jmod
wouldn't hurt. Turns out I was wrong, since I soon found that the 
output stream used by JmodTask is

unbuffered...

Applied a trivial patch[2] and results of running the micro with -f 10 
-i 1 -bm ss (which is more or less like

running jmod standalone):

Benchmark   Mode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBasess   10  1.966 ± 0.297   s/op # before
JmodBenchmark.jmodJavaBasess   10  1.196 ± 0.142   s/op # after

Seems like a notable reduction right there. Timing runs of jmod 
standalone gives analogous results on

real time, but user time is still almost as high.

Poking around further and it's obvious JIT threads are eating a larger 
portion of my cycles now - likely C2 is
ramping up but not having time to get much done in the short life-time 
of jmod, which is mostly spent in
native code anyhow. Switching to running short-running apps with only 
C1 can be profitable, especially on
machines with a lot of cores (like the 2x8x2 machine I'm running this 
on), so I ran the numbers:


Again, with time:

Benchmark   Mode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBasess   10  1.175 ± 0.147   s/op

real0m17.140s
user0m54.868s
sys0m4.172s

-XX:TieredStopAtLevel=1

BenchmarkMode  Cnt  Score   Error  Units
JmodBenchmark.jmodJavaBase  thrpt   10  1.075 ± 0.194  ops/s

real0m14.810s
user0m15.556s
sys0m1.584s

Yep, only running "C1" improves things a lot in this case and on my 
environment.


I suggest accepting the patch[2] as well as switching the jmod runner 
to run with -XX:TieredStopAtLevel=1
or similar. Both are likely needed for most to see any effect on build 
times.


A long term alternative to consider might be to implement a 
server-based jmod akin to the javac server.

Thanks Claes, this is good analysis!

The create method should be using a BufferedOutputStream, I'm surprised 
that it isn't. 'll get that patch in the current refresh although it 
looks like this helps more with the benchmark that with the build.


I changed make/CreateJmods.java to use -XX:TieredStopAtLevel=1 and make 
a bit difference in the build. The wall clock time to create the jmods 
on my local machine drops from 46s. to 22s. I also tried a remote 
Windows machine and the time to create the jmods also dropped by about 20s.


I'm sure Erik will have advice on how to fit this in. As things stand, 
the VM options for the jmod command are configured in spec.gmk.in to to 
use $(JAVA_TOOL_FLAGS_SMALL). Maybe it's time to change 
JAVA_TOOL_FLAGS_SMALL as it it seems to be  -XX:+UseSerialGC and some 
heap settings at this time.


As regards changing the jmod launcher then one concern with that it the 
options might conflict with options specified via -J so would need to 
look at that more closely.


-Alan


Re: Speeding up jmod

2016-05-01 Thread Andrej Golovnin
Hi Alan,

If you would like to improve the performance a little bit more,
you can make additional changes (see the patch) to JmodTask
and maybe Claes can rerun his tests to see if it helps.

The patch makes following changes:

- reads module info in a little bit more efficient way.
- avoids duplicate searching for module info
  when hashing of modules is required.
- uses String#lastIndexOf(int) in toPackageName(ZipEntry)
  instead of String#lastIndexOf(String).
- avoids checking for every directory entry in a JarFile if it ends
  with “module-info.class”.

Best regards,
Andrej Golovnin
diff -r 7210b5dbd92f src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java
--- a/src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java  Sat Apr 30 
16:41:08 2016 -0700
+++ b/src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java  Sun May 01 
15:43:24 2016 +0200
@@ -373,7 +373,6 @@
 final String osArch = options.osArch;
 final String osVersion = options.osVersion;
 final List excludes = options.excludes;
-final Hasher hasher = hasher();
 
 JmodFileWriter() { }
 
@@ -400,31 +399,34 @@
  * on the class path of directories and JAR files.
  */
 Supplier newModuleInfoSupplier() throws IOException {
-ByteArrayOutputStream baos = new ByteArrayOutputStream();
+byte[] bytes = null;
 for (Path e: classpath) {
 if (Files.isDirectory(e)) {
 Path mi = e.resolve(MODULE_INFO);
 if (Files.isRegularFile(mi)) {
-Files.copy(mi, baos);
+bytes = Files.readAllBytes(mi);
 break;
 }
 } else if (Files.isRegularFile(e) && 
e.toString().endsWith(".jar")) {
 try (JarFile jf = new JarFile(e.toFile())) {
 ZipEntry entry = jf.getEntry(MODULE_INFO);
 if (entry != null) {
-jf.getInputStream(entry).transferTo(baos);
-break;
+byte[] tmp = new byte[(int) entry.getSize()];
+if (jf.getInputStream(entry).read(tmp) == 
tmp.length) {
+bytes = tmp;
+break;
+}
 }
 } catch (ZipException x) {
 // Skip. Do nothing. No packages will be added.
 }
 }
 }
-if (baos.size() == 0) {
+if (bytes == null) {
 return null;
 } else {
-byte[] bytes = baos.toByteArray();
-return () -> new ByteArrayInputStream(bytes);
+byte[] tmp = bytes;
+return () -> new ByteArrayInputStream(tmp);
 }
 }
 
@@ -450,7 +452,6 @@
 try (InputStream in = miSupplier.get()) {
 descriptor = ModuleDescriptor.read(in);
 }
-
 // copy the module-info.class into the jmod with the additional
 // attributes for the version, main class and other meta data
 try (InputStream in = miSupplier.get()) {
@@ -479,6 +480,7 @@
 if (moduleVersion != null)
 extender.version(moduleVersion);
 
+Hasher hasher = hasher(descriptor);
 if (hasher != null) {
 ModuleHashes moduleHashes = 
hasher.computeHashes(descriptor.name());
 if (moduleHashes != null) {
@@ -504,50 +506,36 @@
  * The jmod file is being created and does not exist in the
  * given modulepath.
  */
-private Hasher hasher() {
+private Hasher hasher(ModuleDescriptor descriptor) {
 if (options.modulesToHash == null)
 return null;
 
-try {
-Supplier miSupplier = newModuleInfoSupplier();
-if (miSupplier == null) {
-throw new IOException(MODULE_INFO + " not found");
+URI uri = options.jmodFile.toUri();
+ModuleReference mref = new ModuleReference(descriptor, uri, new 
Supplier<>() {
+@Override
+public ModuleReader get() {
+throw new UnsupportedOperationException();
 }
+});
 
-ModuleDescriptor descriptor;
-try (InputStream in = miSupplier.get()) {
-descriptor = ModuleDescriptor.read(in);
-}
+// compose a module finder with the module path and also
+// a module finder that can find the jmod file being created
+ModuleFinder finder = ModuleFinder.compose(options.moduleFinder,
+new ModuleFinder() {
+@Override
+public Optional find(Strin

Re: Speeding up jmod

2016-05-01 Thread Chris Hegarty

> On 1 May 2016, at 12:53, Alan Bateman  wrote:
> 
> On 01/05/2016 03:02, Claes Redestad wrote:
>> Hi,
>> 
>> Alan asked me to take a look at jmod performance (also jlink, but saving 
>> that for another day), so I set
>> up a naive benchmark[1] and started profiling.
>> 
>> ... and saw nothing really suspicious except that time is split between 
>> doing I/O and executing native code in
>> libz.so, which I guess isn't surprising. Oddly enough the only java methods 
>> that even show up in
>> profiles are related to writing, so I figured taking a closer look at the 
>> code for writing output from jmod
>> wouldn't hurt. Turns out I was wrong, since I soon found that the output 
>> stream used by JmodTask is
>> unbuffered...
>> 
>> Applied a trivial patch[2] and results of running the micro with -f 10 -i 1 
>> -bm ss (which is more or less like
>> running jmod standalone):
>> 
>> Benchmark   Mode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBasess   10  1.966 ± 0.297   s/op # before
>> JmodBenchmark.jmodJavaBasess   10  1.196 ± 0.142   s/op # after
>> 
>> Seems like a notable reduction right there. Timing runs of jmod standalone 
>> gives analogous results on
>> real time, but user time is still almost as high.
>> 
>> Poking around further and it's obvious JIT threads are eating a larger 
>> portion of my cycles now - likely C2 is
>> ramping up but not having time to get much done in the short life-time of 
>> jmod, which is mostly spent in
>> native code anyhow. Switching to running short-running apps with only C1 can 
>> be profitable, especially on
>> machines with a lot of cores (like the 2x8x2 machine I'm running this on), 
>> so I ran the numbers:
>> 
>> Again, with time:
>> 
>> Benchmark   Mode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBasess   10  1.175 ± 0.147   s/op
>> 
>> real0m17.140s
>> user0m54.868s
>> sys0m4.172s
>> 
>> -XX:TieredStopAtLevel=1
>> 
>> BenchmarkMode  Cnt  Score   Error  Units
>> JmodBenchmark.jmodJavaBase  thrpt   10  1.075 ± 0.194  ops/s
>> 
>> real0m14.810s
>> user0m15.556s
>> sys0m1.584s
>> 
>> Yep, only running "C1" improves things a lot in this case and on my 
>> environment.
>> 
>> I suggest accepting the patch[2] as well as switching the jmod runner to run 
>> with -XX:TieredStopAtLevel=1
>> or similar. Both are likely needed for most to see any effect on build times.
>> 
>> A long term alternative to consider might be to implement a server-based 
>> jmod akin to the javac server.
> Thanks Claes, this is good analysis!

Yes, this is great work. Thanks Claes.

> The create method should be using a BufferedOutputStream,

This was an oversight in the original implementation. The output
should be buffered.

> I'm surprised that it isn't. 'll get that patch in the current refresh 
> although it looks like this helps more with the benchmark that with the build.
> 
> I changed make/CreateJmods.java to use -XX:TieredStopAtLevel=1 and make a bit 
> difference in the build. The wall clock time to create the jmods on my local 
> machine drops from 46s. to 22s. I also tried a remote Windows machine and the 
> time to create the jmods also dropped by about 20s.

Wow, this is a real win. Good find.

> I'm sure Erik will have advice on how to fit this in. As things stand, the VM 
> options for the jmod command are configured in spec.gmk.in to to use 
> $(JAVA_TOOL_FLAGS_SMALL). Maybe it's time to change JAVA_TOOL_FLAGS_SMALL as 
> it it seems to be  -XX:+UseSerialGC and some heap settings at this time.

I would expect that a number of other tools could benefit from this
too.

-Chris.



Re: Speeding up jmod

2016-05-01 Thread Chris Hegarty
On 1 May 2016, at 14:50, Andrej Golovnin  wrote:
> 
> Hi Alan,
> 
> If you would like to improve the performance a little bit more,
> you can make additional changes (see the patch) to JmodTask
> and maybe Claes can rerun his tests to see if it helps.

These look good too, even as general clean ups, but it would
be good to get some numbers on them.

> The patch makes following changes:
> 
> - reads module info in a little bit more efficient way.
> - avoids duplicate searching for module info
>  when hashing of modules is required.
> - uses String#lastIndexOf(int) in toPackageName(ZipEntry)
>  instead of String#lastIndexOf(String).
> - avoids checking for every directory entry in a JarFile if it ends
>  with “module-info.class”.

-Chris.



Re: Speeding up jmod

2016-05-01 Thread Andrej Golovnin
Hi Chris, Alan,

I have updated my patch. Additionally to the other changes
it warps the InputStream returned by Files.newInputStream()
in a BufferedInputStream in the method
JmodFileWriter#processSection(ZipOutputStream, Section, Path)
because Files.newInputStream() returns an unbuffered InputStream
per specification.

Best regards,
Andrej Golovnin
diff -r 7210b5dbd92f src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java
--- a/src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java  Sat Apr 30 
16:41:08 2016 -0700
+++ b/src/jdk.jlink/share/classes/jdk/tools/jmod/JmodTask.java  Sun May 01 
19:09:04 2016 +0200
@@ -373,7 +373,6 @@
 final String osArch = options.osArch;
 final String osVersion = options.osVersion;
 final List excludes = options.excludes;
-final Hasher hasher = hasher();
 
 JmodFileWriter() { }
 
@@ -400,31 +399,34 @@
  * on the class path of directories and JAR files.
  */
 Supplier newModuleInfoSupplier() throws IOException {
-ByteArrayOutputStream baos = new ByteArrayOutputStream();
+byte[] bytes = null;
 for (Path e: classpath) {
 if (Files.isDirectory(e)) {
 Path mi = e.resolve(MODULE_INFO);
 if (Files.isRegularFile(mi)) {
-Files.copy(mi, baos);
+bytes = Files.readAllBytes(mi);
 break;
 }
 } else if (Files.isRegularFile(e) && 
e.toString().endsWith(".jar")) {
 try (JarFile jf = new JarFile(e.toFile())) {
 ZipEntry entry = jf.getEntry(MODULE_INFO);
 if (entry != null) {
-jf.getInputStream(entry).transferTo(baos);
-break;
+byte[] tmp = new byte[(int) entry.getSize()];
+if (jf.getInputStream(entry).read(tmp) == 
tmp.length) {
+bytes = tmp;
+break;
+}
 }
 } catch (ZipException x) {
 // Skip. Do nothing. No packages will be added.
 }
 }
 }
-if (baos.size() == 0) {
+if (bytes == null) {
 return null;
 } else {
-byte[] bytes = baos.toByteArray();
-return () -> new ByteArrayInputStream(bytes);
+byte[] tmp = bytes;
+return () -> new ByteArrayInputStream(tmp);
 }
 }
 
@@ -450,7 +452,6 @@
 try (InputStream in = miSupplier.get()) {
 descriptor = ModuleDescriptor.read(in);
 }
-
 // copy the module-info.class into the jmod with the additional
 // attributes for the version, main class and other meta data
 try (InputStream in = miSupplier.get()) {
@@ -479,6 +480,7 @@
 if (moduleVersion != null)
 extender.version(moduleVersion);
 
+Hasher hasher = hasher(descriptor);
 if (hasher != null) {
 ModuleHashes moduleHashes = 
hasher.computeHashes(descriptor.name());
 if (moduleHashes != null) {
@@ -504,50 +506,36 @@
  * The jmod file is being created and does not exist in the
  * given modulepath.
  */
-private Hasher hasher() {
+private Hasher hasher(ModuleDescriptor descriptor) {
 if (options.modulesToHash == null)
 return null;
 
-try {
-Supplier miSupplier = newModuleInfoSupplier();
-if (miSupplier == null) {
-throw new IOException(MODULE_INFO + " not found");
+URI uri = options.jmodFile.toUri();
+ModuleReference mref = new ModuleReference(descriptor, uri, new 
Supplier<>() {
+@Override
+public ModuleReader get() {
+throw new UnsupportedOperationException();
 }
+});
 
-ModuleDescriptor descriptor;
-try (InputStream in = miSupplier.get()) {
-descriptor = ModuleDescriptor.read(in);
-}
+// compose a module finder with the module path and also
+// a module finder that can find the jmod file being created
+ModuleFinder finder = ModuleFinder.compose(options.moduleFinder,
+new ModuleFinder() {
+@Override
+public Optional find(String name) {
+if (descriptor.name().equals(name))
+return Optional.of(mref);
+else return Optional.empty();
+}
 
-URI uri = options.

Re: Speeding up jmod

2016-05-02 Thread Alan Bateman


On 01/05/2016 17:02, Chris Hegarty wrote:

On 1 May 2016, at 14:50, Andrej Golovnin  wrote:

Hi Alan,

If you would like to improve the performance a little bit more,
you can make additional changes (see the patch) to JmodTask
and maybe Claes can rerun his tests to see if it helps.

These look good too, even as general clean ups, but it would
be good to get some numbers on them.

Andrej - aside from the micro benchmarks, the real test is `make jmods` 
where the tool runs N times (N = number of modules) to create the 
packaged modules. Short lived tools have always been the JDK's Achilles 
heel. It may be that we will eventually need a "jmod server", like the 
sjavac server in the build today. Alternative it could be wrapped in 
another tool that capable of creating several packaged modules. There is 
a balance here as we also don't want to over complicated things.


-Alan


Re: Speeding up jmod

2016-05-02 Thread Jonathan Gibbons

On 05/02/2016 12:06 AM, Alan Bateman wrote:


On 01/05/2016 17:02, Chris Hegarty wrote:
On 1 May 2016, at 14:50, Andrej Golovnin  
wrote:

Hi Alan,

If you would like to improve the performance a little bit more,
you can make additional changes (see the patch) to JmodTask
and maybe Claes can rerun his tests to see if it helps.

These look good too, even as general clean ups, but it would
be good to get some numbers on them.

Andrej - aside from the micro benchmarks, the real test is `make 
jmods` where the tool runs N times (N = number of modules) to create 
the packaged modules. Short lived tools have always been the JDK's 
Achilles heel. It may be that we will eventually need a "jmod server", 
like the sjavac server in the build today. Alternative it could be 
wrapped in another tool that capable of creating several packaged 
modules. There is a balance here as we also don't want to over 
complicated things.


-Alan



Rather than build "sjavac"-like server functionality into selected 
tools, maybe we could generalize the server part of sjavac to be a "tool 
server", capable of running any tool implementing a simple Tool interface.


-- Jon