On Fri, 8 Oct 2021 23:31:32 GMT, Mandy Chung <mch...@openjdk.org> wrote:
>> This reimplements core reflection with method handles.
>>
>> For `Constructor::newInstance` and `Method::invoke`, the new implementation
>> uses `MethodHandle`. For `Field` accessor, the new implementation uses
>> `VarHandle`. For the first few invocations of one of these reflective
>> methods on a specific reflective object we invoke the corresponding method
>> handle directly. After that we spin a dynamic bytecode stub defined in a
>> hidden class which loads the target `MethodHandle` or `VarHandle` from its
>> class data as a dynamically computed constant. Loading the method handle
>> from a constant allows JIT to inline the method-handle invocation in order
>> to achieve good performance.
>>
>> The VM's native reflection methods are needed during early startup, before
>> the method-handle mechanism is initialized. That happens soon after
>> System::initPhase1 and before System::initPhase2, after which we switch to
>> using method handles exclusively.
>>
>> The core reflection and method handle implementation are updated to handle
>> chained caller-sensitive method calls [1] properly. A caller-sensitive
>> method can define with a caller-sensitive adapter method that will take an
>> additional caller class parameter and the adapter method will be annotated
>> with `@CallerSensitiveAdapter` for better auditing. See the detailed
>> description from [2].
>>
>> Ran tier1-tier8 tests.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8013527
>> [2]
>> https://bugs.openjdk.java.net/browse/JDK-8271820?focusedCommentId=14439430&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14439430
>
> Mandy Chung has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Fix left-over assignment

[8cb8071](https://github.com/openjdk/jdk/pull/5027/commits/8cb8071d9a085349139215c8472730193650b247) adds the setup code to pollute the profile to avoid unwanted inlining in some cases.
The benchmark numbers are now sensible: the `Var` cases should not perform better than the `Const` cases. I observed that if `polluteProfile` is false, some `Var` cases perform better than `Const` cases.

Updated performance result:

Baseline (jdk-18+17)

Benchmark                                     Mode  Cnt    Score   Error  Units
ReflectionSpeedBenchmark.constructorConst     avgt   10   68.049 ± 0.872  ns/op
ReflectionSpeedBenchmark.constructorPoly      avgt   10   94.132 ± 1.805  ns/op
ReflectionSpeedBenchmark.constructorVar       avgt   10   64.543 ± 0.799  ns/op
ReflectionSpeedBenchmark.instanceFieldConst   avgt   10   35.361 ± 0.492  ns/op
ReflectionSpeedBenchmark.instanceFieldPoly    avgt   10   67.089 ± 3.288  ns/op
ReflectionSpeedBenchmark.instanceFieldVar     avgt   10   35.745 ± 0.554  ns/op
ReflectionSpeedBenchmark.instanceMethodConst  avgt   10   77.925 ± 2.026  ns/op
ReflectionSpeedBenchmark.instanceMethodPoly   avgt   10   96.094 ± 2.269  ns/op
ReflectionSpeedBenchmark.instanceMethodVar    avgt   10   80.002 ± 4.267  ns/op
ReflectionSpeedBenchmark.staticFieldConst     avgt   10   33.442 ± 2.659  ns/op
ReflectionSpeedBenchmark.staticFieldPoly      avgt   10   51.918 ± 1.522  ns/op
ReflectionSpeedBenchmark.staticFieldVar       avgt   10   33.967 ± 0.451  ns/op
ReflectionSpeedBenchmark.staticMethodConst    avgt   10   75.380 ± 1.660  ns/op
ReflectionSpeedBenchmark.staticMethodPoly     avgt   10   93.553 ± 1.037  ns/op
ReflectionSpeedBenchmark.staticMethodVar      avgt   10   76.728 ± 1.614  ns/op

JEP 417

Benchmark                                     Mode  Cnt    Score   Error  Units
ReflectionSpeedBenchmark.constructorConst     avgt   10   32.392 ± 0.473  ns/op
ReflectionSpeedBenchmark.constructorPoly      avgt   10  113.947 ± 1.205  ns/op
ReflectionSpeedBenchmark.constructorVar       avgt   10   76.885 ± 1.128  ns/op
ReflectionSpeedBenchmark.instanceFieldConst   avgt   10   18.569 ± 0.161  ns/op
ReflectionSpeedBenchmark.instanceFieldPoly    avgt   10   98.671 ± 2.015  ns/op
ReflectionSpeedBenchmark.instanceFieldVar     avgt   10   54.193 ± 3.510  ns/op
ReflectionSpeedBenchmark.instanceMethodConst  avgt   10   33.421 ± 0.406  ns/op
ReflectionSpeedBenchmark.instanceMethodPoly   avgt   10  109.129 ± 1.959  ns/op
ReflectionSpeedBenchmark.instanceMethodVar    avgt   10   90.420 ± 2.187  ns/op
ReflectionSpeedBenchmark.staticFieldConst     avgt   10   19.080 ± 0.179  ns/op
ReflectionSpeedBenchmark.staticFieldPoly      avgt   10   92.130 ± 2.729  ns/op
ReflectionSpeedBenchmark.staticFieldVar       avgt   10   53.899 ± 1.051  ns/op
ReflectionSpeedBenchmark.staticMethodConst    avgt   10   35.907 ± 0.456  ns/op
ReflectionSpeedBenchmark.staticMethodPoly     avgt   10  102.895 ± 1.604  ns/op
ReflectionSpeedBenchmark.staticMethodVar      avgt   10   82.123 ± 0.629  ns/op

I also ran the following benchmarks, which show no performance degradation:
- Peter's custom JSON serialization and deserialization benchmark with Jackson
- XStream converter type benchmark
- Kryo field serializer benchmark

Microbenchmarks show that the new implementation is significantly faster than the old implementation in the `Const` cases (43-57% faster). When `Field`, `Method`, and `Constructor` instances are held in non-constant fields (e.g. a non-final field or an array element), microbenchmarks show some performance degradation. While the degradation is significant (51-77%) for field access when `Field` instances cannot be constant folded, it might not impact real-world applications. We will need developers to test with early-access builds to help identify any behavior and performance regressions. We will also explore performance improvements after the JEP is integrated, such as refining bytecode shapes for field access so that a concrete `MethodHandle` or `VarHandle` can be reliably optimized by JIT irrespective of whether the receiver is constant or not.

@plevart @cl4es if you agree, could you please do the (hopefully) final review?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5027
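
For anyone who wants to try the `Const` versus `Var` distinction outside the benchmark, the following is a minimal sketch of the two shapes. The class `ConstVsVarDemo` and its members are hypothetical names made up for illustration; they are not taken from the PR or from `ReflectionSpeedBenchmark`. It only shows how the reflective object is held in each case and does not measure anything.

```java
import java.lang.reflect.Method;

public class ConstVsVarDemo {
    // Hypothetical target method, used only for illustration.
    public static int answer() {
        return 42;
    }

    // "Const" shape: the Method is held in a static final field,
    // so JIT may treat the reflective object as a constant.
    static final Method CONST_METHOD = lookup();

    // "Var" shape: the Method is held in a non-final field,
    // so the reflective object cannot be constant folded.
    static Method varMethod = lookup();

    private static Method lookup() {
        try {
            return ConstVsVarDemo.class.getMethod("answer");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    static int callViaConst() {
        return invoke(CONST_METHOD);
    }

    static int callViaVar() {
        return invoke(varMethod);
    }

    private static int invoke(Method m) {
        try {
            // The receiver is null because the target method is static.
            return (Integer) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callViaConst());
        System.out.println(callViaVar());
    }
}
```

Both calls return the same value; only the way the `Method` is stored differs. A real measurement would need a JMH harness like the one that produced the results above.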