----- Mail original -----
> De: "Jorn Vernee" <jver...@openjdk.java.net>
> À: "core-libs-dev" <core-libs-dev@openjdk.java.net>
> Envoyé: Mardi 13 Avril 2021 16:59:58
> Objet: Re: RFR: 8263087: Add a MethodHandle combinator that switches over a 
> set of MethodHandles

> On Thu, 8 Apr 2021 18:51:21 GMT, Jorn Vernee <jver...@openjdk.org> wrote:
> 
>> This patch adds a `tableSwitch` combinator that can be used to switch over a 
>> set
>> of method handles given an index, with a fallback in case the index is out of
>> bounds, much like the `tableswitch` bytecode. Here is a description of how it
>> works (copied from the javadoc):
>> 
>>      Creates a table switch method handle, which can be used to switch over 
>> a set of
>>      target
>>      method handles, based on a given target index, called selector.
>> 
>>      For a selector value of {@code n}, where {@code n} falls in the range 
>> {@code [0,
>>      N)},
>>      and where {@code N} is the number of target method handles, the table 
>> switch
>>      method
>>      handle will invoke the n-th target method handle from the list of 
>> target method
>>      handles.
>> 
>>      For a selector value that does not fall in the range {@code [0, N)}, 
>> the table
>>      switch
>>      method handle will invoke the given fallback method handle.
>> 
>>      All method handles passed to this method must have the same type, with 
>> the
>>      additional
>>      requirement that the leading parameter be of type {@code int}. The 
>> leading
>>      parameter
>>      represents the selector.
>> 
>>      Any trailing parameters present in the type will appear on the returned 
>> table
>>      switch
>>      method handle as well. Any arguments assigned to these parameters will 
>> be
>>      forwarded,
>>      together with the selector value, to the selected method handle when 
>> invoking
>>      it.
>> 
>> The combinator does not support specifying the starting index, so the switch
>> cases always run from 0 to however many target handles are specified. A
>> starting index can be added manually with another combination step that 
>> filters
>> the input index by adding or subtracting a constant from it, which does not
>> affect performance. One of the reasons for not supporting a starting index is
>> that it allows for more lambda form sharing, but also simplifies the
>> implementation somewhat. I guess an open question is if a convenience 
>> overload
>> should be added for that case?
>> 
>> Lookup switch can also be simulated by filtering the input through an 
>> injection
>> function that translates it into a case index, which has also proven to have
>> the ability to have comparable performance to, or even better performance 
>> than,
>> a bytecode-native `lookupswitch` instruction. I plan to add such an injection
>> function to the runtime libraries in the future as well. Maybe at that point 
>> it
>> could be evaluated if it's worth it to add a lookup switch combinator as 
>> well,
>> but I don't see an immediate need to include it in this patch.
>> 
>> The current bytecode intrinsification generates a call for each switch case,
>> which guarantees full inlining of the target method handles. Alternatively we
>> could only have 1 callsite at the end of the switch, where each case just 
>> loads
>> the target method handle, but currently this does not allow for inlining of 
>> the
>> handles, since they are not constant.
>> 
>> Maybe a future C2 optimization could look at the receiver input for 
>> invokeBasic
>> call sites, and if the input is a phi node, clone the call for each constant
>> input of the phi. I believe that would allow simplifying the bytecode without
>> giving up on inlining.
>> 
>> Some numbers from the added benchmarks:
>> 
>> Benchmark                                        (numCases)  (offset)  
>> (sorted)
>> Mode  Cnt   Score   Error  Units
>> MethodHandlesTableSwitchConstant.testSwitch               5         0       
>> N/A
>> avgt   30   4.186 � 0.054  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch               5       150       
>> N/A
>> avgt   30   4.164 � 0.057  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10         0       
>> N/A
>> avgt   30   4.124 � 0.023  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10       150       
>> N/A
>> avgt   30   4.126 � 0.025  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25         0       
>> N/A
>> avgt   30   4.137 � 0.042  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25       150       
>> N/A
>> avgt   30   4.113 � 0.016  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50         0       
>> N/A
>> avgt   30   4.118 � 0.028  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50       150       
>> N/A
>> avgt   30   4.127 � 0.019  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100         0       
>> N/A
>> avgt   30   4.116 � 0.013  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100       150       
>> N/A
>> avgt   30   4.121 � 0.020  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5         0       
>> N/A
>> avgt   30   4.113 � 0.009  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5       150       
>> N/A
>> avgt   30   4.149 � 0.041  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10         0       
>> N/A
>> avgt   30   4.121 � 0.026  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10       150       
>> N/A
>> avgt   30   4.113 � 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25         0       
>> N/A
>> avgt   30   4.129 � 0.028  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25       150       
>> N/A
>> avgt   30   4.105 � 0.019  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50         0       
>> N/A
>> avgt   30   4.097 � 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50       150       
>> N/A
>> avgt   30   4.131 � 0.037  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100         0       
>> N/A
>> avgt   30   4.135 � 0.025  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100       150       
>> N/A
>> avgt   30   4.139 � 0.145  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0      
>> true
>> avgt   30   4.894 � 0.028  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0     
>> false
>> avgt   30  11.526 � 0.194  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150      
>> true
>> avgt   30   4.882 � 0.025  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150     
>> false
>> avgt   30  11.532 � 0.034  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0      
>> true
>> avgt   30   5.065 � 0.076  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0     
>> false
>> avgt   30  13.016 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150      
>> true
>> avgt   30   5.103 � 0.051  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150     
>> false
>> avgt   30  12.984 � 0.102  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0      
>> true
>> avgt   30   8.441 � 0.165  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0     
>> false
>> avgt   30  13.371 � 0.060  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150      
>> true
>> avgt   30   8.628 � 0.032  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150     
>> false
>> avgt   30  13.542 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0      
>> true
>> avgt   30   4.701 � 0.015  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0     
>> false
>> avgt   30  13.562 � 0.063  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150      
>> true
>> avgt   30   7.991 � 3.111  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150     
>> false
>> avgt   30  13.543 � 0.088  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0      
>> true
>> avgt   30   4.712 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0     
>> false
>> avgt   30  13.600 � 0.085  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150      
>> true
>> avgt   30   4.676 � 0.011  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150     
>> false
>> avgt   30  13.476 � 0.043  ms/op
>> 
>> 
>> Testing:
>> - [x] Running of included benchmarks
>> - [x] Inspecting inlining trace and verifying method handle targets are 
>> inlined
>> - [x] Running TestTableSwitch test (currently the only user of the new code)
>> - [x] Running java/lang/invoke tests (just in case)
>> - [x] Some manual testing
>> 
>> Thanks,
>> Jorn
> 
>> you have two strategy to de-sugar a switch,
>> 
>> * if what you do after the case do not mutate any variables, you can desugar
>> each case to a method more or less like a lambda (it's not exactly like a
>> lambda because there is no capture) and you have an indy in front that will
>> call the right method handles
>> 
>> * you have a front end, with an indy that associate the object to an int and 
>> a
>> backend which is tableswitch in the bytecode
>> 
>> ...
>>
>> The tests above are using the first strategy
> 
> No, they are using the second strategy. The SwitchBootstraps patch I linked to
> replaces the front end `lookupswitch` of a String switch with an
> `invokedynamic` that computes an index for the back end jump table, which is
> still a `tableswitch` in the bytecode.
> 
> As John also described, a hypothetical lookupSwitch combinator can be emulated
> by using a `k -> [0, N)` projection that feeds into the tableSwitch combinator
> that is proposed by this PR. The point of the examples I linked to was to show
> several flavors of projection functions as an example of how this could be
> implemented, and to show that they have competitive performance with a native
> `lookupswitch` instruction (the 'legacy' case). i.e. the benchmarks show the
> difference between `lookupswitch` implemented in bytecode, and a `k -> [0, N)`
> projection function built by an `invokedynamic`. (sorry, I should have offered
> more explanation in the first place)
> 
> The combinator added by _this_ PR is not meant to replace any part of the 
> String
> switch translation. For pattern switch the `tableSwitch` combinator _could_ be
> used to implement the front end `k -> [0, N)` projection, but it is not
> strictly required. Either way, that seems orthogonal to this PR.

I agree this is orthogonal and we can continue that discussion without blocking 
this PR.

About your benchmark, did you test with some strings going into "default", 
because it is usually in that case that you need a proper lookup switch,
another way to say it is that, your results are too good when you use a cascade 
of guardWithTest.


> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/3401

Rémi

Reply via email to