On Thu, 8 Apr 2021 18:51:21 GMT, Jorn Vernee <jver...@openjdk.org> wrote:

> This patch adds a `tableSwitch` combinator that can be used to switch over a 
> set of method handles given an index, with a fallback in case the index is 
> out of bounds, much like the `tableswitch` bytecode. Here is a description of 
> how it works (copied from the javadoc):
> 
>      Creates a table switch method handle, which can be used to switch over a 
> set of target
>      method handles, based on a given target index, called selector.
> 
>      For a selector value of {@code n}, where {@code n} falls in the range 
> {@code [0, N)},
>      and where {@code N} is the number of target method handles, the table 
> switch method
>      handle will invoke the n-th target method handle from the list of target 
> method handles.
> 
>      For a selector value that does not fall in the range {@code [0, N)}, the 
> table switch
>      method handle will invoke the given fallback method handle.
> 
>      All method handles passed to this method must have the same type, with 
> the additional
>      requirement that the leading parameter be of type {@code int}. The 
> leading parameter
>      represents the selector.
> 
>      Any trailing parameters present in the type will appear on the returned 
> table switch
>      method handle as well. Any arguments assigned to these parameters will 
> be forwarded,
>      together with the selector value, to the selected method handle when 
> invoking it.
> 
> The combinator does not support specifying the starting index, so the switch 
> cases always run from 0 to however many target handles are specified. A 
> starting index can be added manually with another combination step that 
> filters the input index by adding or subtracting a constant from it, which 
> does not affect performance. One of the reasons for not supporting a starting 
> index is that it allows for more lambda form sharing, but also simplifies the 
> implementation somewhat. I guess an open question is if a convenience 
> overload should be added for that case?
> 
> Lookup switch can also be simulated by filtering the input through an 
> injection function that translates it into a case index, which has also 
> proven to have the ability to have comparable performance to, or even better 
> performance than, a bytecode-native `lookupswitch` instruction. I plan to add 
> such an injection function to the runtime libraries in the future as well. 
> Maybe at that point it could be evaluated if it's worth it to add a lookup 
> switch combinator as well, but I don't see an immediate need to include it in 
> this patch.
> 
> The current bytecode intrinsification generates a call for each switch case, 
> which guarantees full inlining of the target method handles. Alternatively we 
> could only have 1 callsite at the end of the switch, where each case just 
> loads the target method handle, but currently this does not allow for 
> inlining of the handles, since they are not constant.
> 
> Maybe a future C2 optimization could look at the receiver input for 
> invokeBasic call sites, and if the input is a phi node, clone the call for 
> each constant input of the phi. I believe that would allow simplifying the 
> bytecode without giving up on inlining.
> 
> Some numbers from the added benchmarks:
> 
> Benchmark                                        (numCases)  (offset)  
> (sorted)  Mode  Cnt   Score   Error  Units
> MethodHandlesTableSwitchConstant.testSwitch               5         0       
> N/A  avgt   30   4.186 � 0.054  ms/op
> MethodHandlesTableSwitchConstant.testSwitch               5       150       
> N/A  avgt   30   4.164 � 0.057  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              10         0       
> N/A  avgt   30   4.124 � 0.023  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              10       150       
> N/A  avgt   30   4.126 � 0.025  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              25         0       
> N/A  avgt   30   4.137 � 0.042  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              25       150       
> N/A  avgt   30   4.113 � 0.016  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              50         0       
> N/A  avgt   30   4.118 � 0.028  ms/op
> MethodHandlesTableSwitchConstant.testSwitch              50       150       
> N/A  avgt   30   4.127 � 0.019  ms/op
> MethodHandlesTableSwitchConstant.testSwitch             100         0       
> N/A  avgt   30   4.116 � 0.013  ms/op
> MethodHandlesTableSwitchConstant.testSwitch             100       150       
> N/A  avgt   30   4.121 � 0.020  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5         0       
> N/A  avgt   30   4.113 � 0.009  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5       150       
> N/A  avgt   30   4.149 � 0.041  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10         0       
> N/A  avgt   30   4.121 � 0.026  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10       150       
> N/A  avgt   30   4.113 � 0.021  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25         0       
> N/A  avgt   30   4.129 � 0.028  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25       150       
> N/A  avgt   30   4.105 � 0.019  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50         0       
> N/A  avgt   30   4.097 � 0.021  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50       150       
> N/A  avgt   30   4.131 � 0.037  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100         0       
> N/A  avgt   30   4.135 � 0.025  ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100       150       
> N/A  avgt   30   4.139 � 0.145  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                 5         0      
> true  avgt   30   4.894 � 0.028  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                 5         0     
> false  avgt   30  11.526 � 0.194  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                 5       150      
> true  avgt   30   4.882 � 0.025  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                 5       150     
> false  avgt   30  11.532 � 0.034  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                10         0      
> true  avgt   30   5.065 � 0.076  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                10         0     
> false  avgt   30  13.016 � 0.020  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                10       150      
> true  avgt   30   5.103 � 0.051  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                10       150     
> false  avgt   30  12.984 � 0.102  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                25         0      
> true  avgt   30   8.441 � 0.165  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                25         0     
> false  avgt   30  13.371 � 0.060  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                25       150      
> true  avgt   30   8.628 � 0.032  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                25       150     
> false  avgt   30  13.542 � 0.020  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                50         0      
> true  avgt   30   4.701 � 0.015  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                50         0     
> false  avgt   30  13.562 � 0.063  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                50       150      
> true  avgt   30   7.991 � 3.111  ms/op
> MethodHandlesTableSwitchRandom.testSwitch                50       150     
> false  avgt   30  13.543 � 0.088  ms/op
> MethodHandlesTableSwitchRandom.testSwitch               100         0      
> true  avgt   30   4.712 � 0.020  ms/op
> MethodHandlesTableSwitchRandom.testSwitch               100         0     
> false  avgt   30  13.600 � 0.085  ms/op
> MethodHandlesTableSwitchRandom.testSwitch               100       150      
> true  avgt   30   4.676 � 0.011  ms/op
> MethodHandlesTableSwitchRandom.testSwitch               100       150     
> false  avgt   30  13.476 � 0.043  ms/op
> 
> 
> Testing:
> - [x] Running of included benchmarks
> - [x] Inspecting inlining trace and verifying method handle targets are 
> inlined
> - [x] Running TestTableSwitch test (currently the only user of the new code)
> - [x] Running java/lang/invoke tests (just in case)
> - [x] Some manual testing
> 
> Thanks,
> Jorn

Overall this looks great! 

I have a number of nits and a request to try and remodel so that 
`NamedFunction` isn't responsible for holding the arbitrary intrinsic data that 
you want to add here.

src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java line 
1404:

> 1402:     private Name emitTableSwitch(int pos, int numCases) {
> 1403:         Name args    = lambdaForm.names[pos];
> 1404:         Name invoker = lambdaForm.names[pos+1];

Add spaces around `+`

src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java line 
1409:

> 1407:         Class<?> returnType = 
> result.function.resolvedHandle().type().returnType();
> 1408:         MethodType caseType = args.function.resolvedHandle().type()
> 1409:             .dropParameterTypes(0,1) // drop collector

Space after comma

src/java.base/share/classes/java/lang/invoke/LambdaForm.java line 731:

> 729:                 collectArgs.isInvokeBasic() &&
> 730:                 unboxResult.isInvokeBasic() &&
> 731:                 tableSwitch.lastUseIndex(collectArgs)  == 3 &&    // 
> t_{n+1}:L=MethodHandleImpl.<invoker>(*, *, *, t_{n});

Extraneous space

src/java.base/share/classes/java/lang/invoke/LambdaForm.java line 1098:

> 1096:         @Stable MethodHandle invoker;
> 1097:         private final MethodHandleImpl.Intrinsic intrinsicName;
> 1098:         private final Object intrinsicData;

This seems like the wrong place to add arbitrary data related only to 
intrinsics. Could this be moved either into 
`MethodHandleImpl$IntrinsicMethodHandle` or - perhaps less appealingly - 
rewrite the `MethodHandleImpl.Intrinsic` to a regular class which can then be 
instantiated with extra data? That `NamedFunction` holds a field of type 
`MethodHandleImpl.Intrinsic` instead of delegating to 
`resolvedHandle.intrinsicName()` seem like a pre-existing kludge that would be 
good to sort out why/if it's really necessary.

src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 2015:

> 2013:     }
> 2014: 
> 2015:     private static final Map<TableSwitchCacheKey, LambdaForm> 
> TABLE_SWITCH_LAMBDA_FORM_CACHE = new ConcurrentHashMap<>();

This could be moved to `class TableSwitchCacheKey` to reduce eagerly 
initialization on bootstrap.

test/micro/org/openjdk/bench/java/lang/invoke/MethodHandlesTableSwitchOpaqueSingle.java
 line 90:

> 88:         "50",
> 89:         "100"
> 90:     })

To reduce default runtime I think it'd be good to slim down the number of sizes 
we test here to 2 or 3 different ones.

test/micro/org/openjdk/bench/java/lang/invoke/MethodHandlesTableSwitchRandom.java
 line 90:

> 88:         "50",
> 89:         "100"
> 90:     })

To reduce default runtime I think it'd be good to slim down the number of sizes 
we test here to 2 or 3 different ones.

-------------

Changes requested by redestad (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3401

Reply via email to