Re: RFR: 8264288: Performance issue with MethodHandle.asCollector

John R Rose Thu, 01 Apr 2021 12:16:50 -0700

On Thu, 1 Apr 2021 13:25:05 GMT, Jorn Vernee <jver...@openjdk.org> wrote:


> This patch speeds up MethodHandle.asCollector handles where the array type is 
> not Object[], as well as speeding up all collectors where the arity is 
> greater than 10.
> 
> The old code is creating a collector handle by combining a set of hard coded 
> methods for collecting arguments into an Object[], up to a maximum of ten 
> elements, and then copying this intermediate array into a final array.
> 
> In principle, it shouldn't matter how slow (or fast) this handle is, because 
> it should be replaced by existing bytecode intrinsification which does the 
> right thing. But, through investigation it turns out that the 
> intrinsification is only being applied in a very limited amount of cases: 
> Object[] with max 10 elements only, only for the intermediate collector 
> handles. Every other collector shape incurs overhead because it essentially 
> ends up calling the ineffecient fallback handle.
> 
> Rather than sticking on a band aid (I tried this, but it turned out to be 
> very tricky to untangle the existing code), the new code replaces the 
> existing implementation with a collector handle implemented using a 
> LambdaForm, which removes the need for intrinsification, and also greatly 
> reduces code-complexity of the implementation. (I plan to expose this 
> collector using a public API in the future as well, so users don't have to go 
> through MHs::identity to make a collector).
> 
> The old code was also using a special lambda form transform for collecting 
> arguments into an array. I believe this was done to take advantage of the 
> existing-but-limited bytecode intrinsification, at least for Object[] with 
> less than 10 elements.
> 
> The new code just uses the existing collect arguments transform with the 
> newly added collector handle as filter, and this works just as well for the 
> existing case, but as a bonus is also much simpler, since no separate 
> transform is needed. Using the collect arguments transform should also 
> improve sharing.
> 
> As a result of these changes a lot of code was unused and has been removed in 
> this patch.
> 
> Testing: tier 1-3, benchmarking using TypedAsCollector (part of the patch), 
> as well as another variant of the benchmark that used a declared static 
> identity method instead of MHs::identity (not included). Before/after 
> comparison of MethodHandleAs* benchmarks (no differences there).
> 
> Here are some numbers from the added benchmark:
> 
> Before:
> Benchmark                                    Mode  Cnt    Score    Error  
> Units
> TypedAsCollector.testIntCollect              avgt   30  189.156 �  1.796  
> ns/op
> TypedAsCollector.testIntCollectHighArity     avgt   30  660.549 � 10.159  
> ns/op
> TypedAsCollector.testObjectCollect           avgt   30    7.092 �  0.042  
> ns/op
> TypedAsCollector.testObjectCollectHighArity  avgt   30   65.225 �  0.546  
> ns/op
> TypedAsCollector.testStringCollect           avgt   30   28.511 �  0.243  
> ns/op
> TypedAsCollector.testStringCollectHighArity  avgt   30   57.054 �  0.635  
> ns/op
> (as you can see, just the Object[] with arity less than 10 case is fast here)
> After:
> Benchmark                                    Mode  Cnt  Score   Error  Units
> TypedAsCollector.testIntCollect              avgt   30  6.569 � 0.131  ns/op
> TypedAsCollector.testIntCollectHighArity     avgt   30  8.923 � 0.066  ns/op
> TypedAsCollector.testObjectCollect           avgt   30  6.813 � 0.035  ns/op
> TypedAsCollector.testObjectCollectHighArity  avgt   30  9.718 � 0.066  ns/op
> TypedAsCollector.testStringCollect           avgt   30  6.737 � 0.016  ns/op
> TypedAsCollector.testStringCollectHighArity  avgt   30  9.618 � 0.052  ns/op
> 
> Thanks,
> Jorn

So many red deletion lines; I'm looking at a beautiful sunset!
It's a sunset for some very old code, some of the first code
I wrote for method handles, long before Lambda Forms
made this sort of task easier.

Thanks very much for cleaning this out.  See a few
minor comments on some diff lines.

src/java.base/share/classes/java/lang/invoke/LambdaFormEditor.java line 651:

> 649:     }
> 650: 
> 651:     LambdaForm collectArgumentArrayForm(int pos, MethodHandle 
> arrayCollector) {

It's counter-intuitive that removing a LFE tactic would be harmless.
Each LFE tactic is a point where LFs can be shared, reducing footprint.
But in this case `collectArgumentArrayForm` is always paired with
`collectArgumentsForm`, so the latter takes care of sharing.
The actual code which makes the arrays is also shared, via
`Makers.TYPED_COLLECTORS` (unchanged).

src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1342:

> 1340:     }
> 1341: 
> 1342:     // Array identity function (used as 
> getConstantHandle(MH_arrayIdentity)).

If this is no longer used, remove it.

src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1903:

> 1901:     }
> 1902: 
> 1903:     private static LambdaForm makeCollectorForm(MethodType basicType, 
> MethodHandle storeFunc) {

This reads very well.  It is the right way to do this job with LFs.
There is a slight risk that the JIT will fail to inline the tiny MHs
that (a) construct arrays and (b) poke elements into them.
If that happens, then perhaps the bytecode generator can
treat them as intrinsics.  Maybe that's worth a comment in
the code, to help later readers?

-------------

Marked as reviewed by jrose (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3306

Re: RFR: 8264288: Performance issue with MethodHandle.asCollector

Reply via email to