On Thu, 6 May 2021 15:20:23 GMT, Сергей Цыпанов 
<github.com+10835776+stsypa...@openjdk.org> wrote:

> Hello, from the discussion in https://github.com/openjdk/jdk/pull/3464 and
> https://github.com/openjdk/jdk/pull/2212 it appears that in `j.l.Class`
> expressions like
> String str = baseName.replace('.', '/') + '/' + name;
> are compiled not into invokedynamic-based code, but into code using
> `StringBuilder`.
> This happens due to some bootstrapping issues. Currently the bytecode for the
> last (most often used) branch of `Class.descriptorString()` looks like
> public sb()Ljava/lang/String;
>    L0
>     LINENUMBER 21 L0
>     NEW java/lang/StringBuilder
>     DUP
>     INVOKESPECIAL java/lang/StringBuilder.<init> ()V
>     ASTORE 1
>    L1
>     LINENUMBER 23 L1
>     ALOAD 1
>     LDC "a"
>     INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
>     POP
>    L2
>     LINENUMBER 24 L2
>     ALOAD 1
>     LDC "b"
>     INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
>     POP
>    L3
>     LINENUMBER 26 L3
>     ALOAD 1
>     INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
> Here the `StringBuilder` is created with the default constructor and then
> expands if necessary while appending.
> This can be improved by manually allocating a `StringBuilder` of the exact size.
> The benchmark demonstrates a measurable improvement:
> import java.util.concurrent.TimeUnit;
> import org.openjdk.jmh.annotations.*;
>
> @State(Scope.Benchmark)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class ClassDescriptorStringBenchmark {
>     private final Class<?> clazzWithShortDescriptor = Object.class;
>     private final Class<?> clazzWithLongDescriptor = getClass();
>     @Benchmark
>     public String descriptorString_short() {
>         return clazzWithShortDescriptor.descriptorString();
>     }
>     @Benchmark
>     public String descriptorString_long() {
>         return clazzWithLongDescriptor.descriptorString();
>     }
> }
> original
> -Xint
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt  6326.478 ± 107.251  ns/op
> descriptorString_short                         avgt  5220.729 ± 103.545  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   528.089 ±   0.021   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   232.036 ±   0.015   B/op
> -XX:TieredStopAtLevel=1
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt   230.223 ±   1.254  ns/op
> descriptorString_short                         avgt   164.255 ±   0.755  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   528.046 ±   0.002   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   232.022 ±   0.001   B/op
> full
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt    74.835 ±   0.262  ns/op
> descriptorString_short                         avgt    43.822 ±   0.788  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   504.010 ±   0.001   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   208.004 ±   0.001   B/op
> ------------------------
> patched
> -Xint
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt  4485.994 ±  60.173  ns/op
> descriptorString_short                         avgt  3949.965 ± 278.143  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   336.051 ±   0.004   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   184.027 ±   0.010   B/op
> -XX:TieredStopAtLevel=1
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt   185.774 ±   1.100  ns/op
> descriptorString_short                         avgt   135.338 ±   1.066  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   336.030 ±   0.001   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   184.019 ±   0.001   B/op
> full
>                                                Mode     Score     Error  Units
> descriptorString_long                          avgt    42.864 ±   0.160  ns/op
> descriptorString_short                         avgt    27.255 ±   0.381  ns/op
> descriptorString_long:·gc.alloc.rate.norm      avgt   224.005 ±   0.001   B/op
> descriptorString_short:·gc.alloc.rate.norm     avgt   120.002 ±   0.001   B/op
> The same can also be done for the `Class.isHidden()` branch in
> `Class.descriptorString()` and for `Class.getCanonicalName0()`.
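
For readers following along, here is a minimal sketch of what "allocating a `StringBuilder` of exact size" could look like for a class descriptor. It is an illustration only, not the code from the PR: the class `PresizedDescriptor` and the method `descriptorOf` are hypothetical names, and the input is assumed to be the binary name of an ordinary (non-array, non-primitive, non-hidden) class.

```java
public class PresizedDescriptor {

    // Hypothetical helper, not JDK code: builds "L<name with '/'>;" for a
    // plain class name such as "java.lang.Object".
    public static String descriptorOf(String binaryName) {
        String internalName = binaryName.replace('.', '/');
        // The final length is known up front: 'L' + name + ';'.
        // Passing it as the initial capacity avoids growing the buffer
        // while appending.
        StringBuilder sb = new StringBuilder(internalName.length() + 2);
        sb.append('L');
        sb.append(internalName);
        sb.append(';');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf("java.lang.Object")); // prints Ljava/lang/Object;
    }
}
```

With the default constructor the builder starts at a capacity of 16 characters and has to reallocate for longer names, which is presumably where the lower `gc.alloc.rate.norm` figures in the patched runs come from.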

> Together with #3627 this allows reducing [minimalistic Spring Boot
> application start-up](https://github.com/stsypanov/spring-boot-benchmark/blob/master/src/main/java/com/tsypanov/sbb/SpringBootApplicationBenchmark.java)
> time from 653 to 645 milliseconds and memory consumption from 43804 to 43668
> kB.

How do you run this benchmark? Something like `-bm ss -f 20`? Otherwise,
repeatedly invoking the Spring Boot initialization in a JMH benchmark method
doesn't seem to model startup very realistically, unless that captures some
iterative development scenario. Since JMH itself loads quite a bit on startup,
it likely skews your results somewhat. Our startup tests are typically more
bare-bones scripts that repeatedly run the app and capture the time to
"start" and the time to run the JVM to completion.
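
To make the last point concrete, a rough sketch of such a bare-bones harness follows. It is not one of our actual scripts: the class `StartupTimer`, the method `runOnceMillis`, and the run count are made up for illustration, and `java -version` stands in for the real application command. It only captures the time to run the JVM to completion; measuring the time to "start" would additionally need a marker from the application itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class StartupTimer {

    // Hypothetical helper: launches the command in a fresh process and
    // returns the wall-clock time in milliseconds until that process exits.
    public static long runOnceMillis(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);                        // merge stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);  // do not let output block the child
        long start = System.nanoTime();
        try {
            Process process = pb.start();
            process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the process", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder command: replace with the application under test.
        String java = System.getProperty("java.home") + "/bin/java";
        List<String> command = List.of(java, "-version");

        int runs = 20; // arbitrary choice for this sketch
        long total = 0;
        for (int i = 0; i < runs; i++) {
            total += runOnceMillis(command);
        }
        System.out.println("mean ms to completion: " + (total / runs));
    }
}
```

Each run starts a new JVM, so nothing loaded by the measuring process ends up in the measured one.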


PR: https://git.openjdk.java.net/jdk/pull/3903
