On Thu, 1 Apr 2021 13:25:05 GMT, Jorn Vernee <jver...@openjdk.org> wrote:
> This patch speeds up MethodHandle.asCollector handles where the array type is > not Object[], as well as speeding up all collectors where the arity is > greater than 10. > > The old code is creating a collector handle by combining a set of hard coded > methods for collecting arguments into an Object[], up to a maximum of ten > elements, and then copying this intermediate array into a final array. > > In principle, it shouldn't matter how slow (or fast) this handle is, because > it should be replaced by existing bytecode intrinsification which does the > right thing. But, through investigation it turns out that the > intrinsification is only being applied in a very limited amount of cases: > Object[] with max 10 elements only, only for the intermediate collector > handles. Every other collector shape incurs overhead because it essentially > ends up calling the ineffecient fallback handle. > > Rather than sticking on a band aid (I tried this, but it turned out to be > very tricky to untangle the existing code), the new code replaces the > existing implementation with a collector handle implemented using a > LambdaForm, which removes the need for intrinsification, and also greatly > reduces code-complexity of the implementation. (I plan to expose this > collector using a public API in the future as well, so users don't have to go > through MHs::identity to make a collector). > > The old code was also using a special lambda form transform for collecting > arguments into an array. I believe this was done to take advantage of the > existing-but-limited bytecode intrinsification, at least for Object[] with > less than 10 elements. > > The new code just uses the existing collect arguments transform with the > newly added collector handle as filter, and this works just as well for the > existing case, but as a bonus is also much simpler, since no separate > transform is needed. Using the collect arguments transform should also > improve sharing. > > As a result of these changes a lot of code was unused and has been removed in > this patch. > > Testing: tier 1-3, benchmarking using TypedAsCollector (part of the patch), > as well as another variant of the benchmark that used a declared static > identity method instead of MHs::identity (not included). Before/after > comparison of MethodHandleAs* benchmarks (no differences there). > > Here are some numbers from the added benchmark: > > Before: > Benchmark Mode Cnt Score Error > Units > TypedAsCollector.testIntCollect avgt 30 189.156 � 1.796 > ns/op > TypedAsCollector.testIntCollectHighArity avgt 30 660.549 � 10.159 > ns/op > TypedAsCollector.testObjectCollect avgt 30 7.092 � 0.042 > ns/op > TypedAsCollector.testObjectCollectHighArity avgt 30 65.225 � 0.546 > ns/op > TypedAsCollector.testStringCollect avgt 30 28.511 � 0.243 > ns/op > TypedAsCollector.testStringCollectHighArity avgt 30 57.054 � 0.635 > ns/op > (as you can see, just the Object[] with arity less than 10 case is fast here) > After: > Benchmark Mode Cnt Score Error Units > TypedAsCollector.testIntCollect avgt 30 6.569 � 0.131 ns/op > TypedAsCollector.testIntCollectHighArity avgt 30 8.923 � 0.066 ns/op > TypedAsCollector.testObjectCollect avgt 30 6.813 � 0.035 ns/op > TypedAsCollector.testObjectCollectHighArity avgt 30 9.718 � 0.066 ns/op > TypedAsCollector.testStringCollect avgt 30 6.737 � 0.016 ns/op > TypedAsCollector.testStringCollectHighArity avgt 30 9.618 � 0.052 ns/op > > Thanks, > Jorn So many red deletion lines; I'm looking at a beautiful sunset! It's a sunset for some very old code, some of the first code I wrote for method handles, long before Lambda Forms made this sort of task easier. Thanks very much for cleaning this out. See a few minor comments on some diff lines. src/java.base/share/classes/java/lang/invoke/LambdaFormEditor.java line 651: > 649: } > 650: > 651: LambdaForm collectArgumentArrayForm(int pos, MethodHandle > arrayCollector) { It's counter-intuitive that removing a LFE tactic would be harmless. Each LFE tactic is a point where LFs can be shared, reducing footprint. But in this case `collectArgumentArrayForm` is always paired with `collectArgumentsForm`, so the latter takes care of sharing. The actual code which makes the arrays is also shared, via `Makers.TYPED_COLLECTORS` (unchanged). src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1342: > 1340: } > 1341: > 1342: // Array identity function (used as > getConstantHandle(MH_arrayIdentity)). If this is no longer used, remove it. src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1903: > 1901: } > 1902: > 1903: private static LambdaForm makeCollectorForm(MethodType basicType, > MethodHandle storeFunc) { This reads very well. It is the right way to do this job with LFs. There is a slight risk that the JIT will fail to inline the tiny MHs that (a) construct arrays and (b) poke elements into them. If that happens, then perhaps the bytecode generator can treat them as intrinsics. Maybe that's worth a comment in the code, to help later readers? ------------- Marked as reviewed by jrose (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3306