This patch speeds up MethodHandle.asCollector handles where the array type is 
not Object[], as well as speeding up all collectors where the arity is greater 
than 10.

The old code is creating a collector handle by combining a set of hard coded 
methods for collecting arguments into an Object[], up to a maximum of ten 
elements, and then copying this intermediate array into a final array.

In principle, it shouldn't matter how slow (or fast) this handle is, because it 
should be replaced by existing bytecode intrinsification which does the right 
thing. But, through investigation it turns out that the intrinsification is 
only being applied in a very limited amount of cases: Object[] with max 10 
elements only, only for the intermediate collector handles. Every other 
collector shape incurs overhead because it essentially ends up calling the 
ineffecient fallback handle.

Rather than sticking on a band aid, the new code replaces the existing 
implementation with a collector handle implemented using a LambdaForm, which 
removes the need for intrinsification, and also greatly reduces code-complexity 
of the implementation. (I plan to expose this collector using a public API in 
the future as well, so users don't have to go through MHs::identity to make a 
collector).

The old code was also using a special lambda form transform for collecting 
arguments into an array. I believe this was done to take advantage of the 
existing-but-limited bytecode intrinsification, at least for Object[] with less 
than 10 elements.

The new code just uses the existing collect arguments transform with the newly 
added collector handle as filter, and this works just as well for the existing 
case, but as a bonus is also much simpler, since no separate transform is 
needed. Using the collect arguments transform should also improve sharing.

As a result of these changes a lot of code was unused and has been removed in 
this patch.

Testing: tier 1-3, benchmarking using TypedAsCollector (part of the patch), as 
well as another variant of the benchmark that used a declared static identity 
method instead of MHs::identity (not included). Before/after comparison of 
MethodHandleAs* benchmarks (no differences there).

Here are some numbers from the added benchmark:

Before:
Benchmark                                    Mode  Cnt    Score    Error  Units
TypedAsCollector.testIntCollect              avgt   30  189.156 �  1.796  ns/op
TypedAsCollector.testIntCollectHighArity     avgt   30  660.549 � 10.159  ns/op
TypedAsCollector.testObjectCollect           avgt   30    7.092 �  0.042  ns/op
TypedAsCollector.testObjectCollectHighArity  avgt   30   65.225 �  0.546  ns/op
TypedAsCollector.testStringCollect           avgt   30   28.511 �  0.243  ns/op
TypedAsCollector.testStringCollectHighArity  avgt   30   57.054 �  0.635  ns/op
(as you can see, just the Object[] with arity less than 10 case is fast here)
After:
Benchmark                                    Mode  Cnt  Score   Error  Units
TypedAsCollector.testIntCollect              avgt   30  6.569 � 0.131  ns/op
TypedAsCollector.testIntCollectHighArity     avgt   30  8.923 � 0.066  ns/op
TypedAsCollector.testObjectCollect           avgt   30  6.813 � 0.035  ns/op
TypedAsCollector.testObjectCollectHighArity  avgt   30  9.718 � 0.066  ns/op
TypedAsCollector.testStringCollect           avgt   30  6.737 � 0.016  ns/op
TypedAsCollector.testStringCollectHighArity  avgt   30  9.618 � 0.052  ns/op

Thanks,
Jorn

-------------

Commit messages:
 - WIP - clean up
 - Completely implement collect in lambdaform
 - WIP - fix MH collector perf, with jumbo arity bandaid

Changes: https://git.openjdk.java.net/jdk/pull/3306/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3306&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264288
  Stats: 513 lines in 8 files changed: 154 ins; 337 del; 22 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3306.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3306/head:pull/3306

PR: https://git.openjdk.java.net/jdk/pull/3306

Reply via email to