----- Mail original ----- > De: "Jorn Vernee" <jver...@openjdk.java.net> > À: "core-libs-dev" <core-libs-dev@openjdk.java.net> > Envoyé: Vendredi 9 Avril 2021 12:51:53 > Objet: RFR: 8263087: Add a MethodHandle combinator that switches over a set > of MethodHandles
> This patch adds a `tableSwitch` combinator that can be used to switch over a > set > of method handles given an index, with a fallback in case the index is out of > bounds, much like the `tableswitch` bytecode. > > The combinator does not support specifying the starting index, so the switch > cases always run from 0 to however many target handles are specified. A > starting index can be added manually with another combination step that > filters > the input index by adding or subtracting a constant from it, which does not > affect performance. One of the reasons for not supporting a starting index is > that it allows for more lambda form sharing, but also simplifies the > implementation somewhat. I guess an open question is if a convenience overload > should be added for that case? I think the combinator should be lookupswitch which is more general than tableswitch with a special case when generating the bytecode to generate a tableswitch instead of a lookupswitch if the indexes are subsequent. > > Lookup switch can also be simulated by filtering the input through an > injection > function that translates it into a case index, which has also proven to have > the ability to have comparable performance to, or even better performance > than, > a bytecode-native `lookupswitch` instruction. I plan to add such an injection > function to the runtime libraries in the future as well. Maybe at that point > it > could be evaluated if it's worth it to add a lookup switch combinator as well, > but I don't see an immediate need to include it in this patch. > As i said in the bug when we discuss about that the filtering function, i believe that the filtering function for emulating lookupswitch is lookupswitch itself. > The current bytecode intrinsification generates a call for each switch case, > which guarantees full inlining of the target method handles. Alternatively we > could only have 1 callsite at the end of the switch, where each case just > loads > the target method handle, but currently this does not allow for inlining of > the > handles, since they are not constant. This scheme also allows to never JIT compile a branch which is never used. > > Maybe a future C2 optimization could look at the receiver input for > invokeBasic > call sites, and if the input is a phi node, clone the call for each constant > input of the phi. I believe that would allow simplifying the bytecode without > giving up on inlining. > > Some numbers from the added benchmarks: > Benchmark (numCases) (offset) > (sorted) > Mode Cnt Score Error Units > MethodHandlesTableSwitchConstant.testSwitch 5 0 > N/A > avgt 30 4.186 � 0.054 ms/op > MethodHandlesTableSwitchConstant.testSwitch 5 150 > N/A > avgt 30 4.164 � 0.057 ms/op > MethodHandlesTableSwitchConstant.testSwitch 10 0 > N/A > avgt 30 4.124 � 0.023 ms/op > MethodHandlesTableSwitchConstant.testSwitch 10 150 > N/A > avgt 30 4.126 � 0.025 ms/op > MethodHandlesTableSwitchConstant.testSwitch 25 0 > N/A > avgt 30 4.137 � 0.042 ms/op > MethodHandlesTableSwitchConstant.testSwitch 25 150 > N/A > avgt 30 4.113 � 0.016 ms/op > MethodHandlesTableSwitchConstant.testSwitch 50 0 > N/A > avgt 30 4.118 � 0.028 ms/op > MethodHandlesTableSwitchConstant.testSwitch 50 150 > N/A > avgt 30 4.127 � 0.019 ms/op > MethodHandlesTableSwitchConstant.testSwitch 100 0 > N/A > avgt 30 4.116 � 0.013 ms/op > MethodHandlesTableSwitchConstant.testSwitch 100 150 > N/A > avgt 30 4.121 � 0.020 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 0 > N/A > avgt 30 4.113 � 0.009 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 150 > N/A > avgt 30 4.149 � 0.041 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 0 > N/A > avgt 30 4.121 � 0.026 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 150 > N/A > avgt 30 4.113 � 0.021 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 0 > N/A > avgt 30 4.129 � 0.028 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 150 > N/A > avgt 30 4.105 � 0.019 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 0 > N/A > avgt 30 4.097 � 0.021 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 150 > N/A > avgt 30 4.131 � 0.037 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 0 > N/A > avgt 30 4.135 � 0.025 ms/op > MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 150 > N/A > avgt 30 4.139 � 0.145 ms/op > MethodHandlesTableSwitchRandom.testSwitch 5 0 > true > avgt 30 4.894 � 0.028 ms/op > MethodHandlesTableSwitchRandom.testSwitch 5 0 > false > avgt 30 11.526 � 0.194 ms/op > MethodHandlesTableSwitchRandom.testSwitch 5 150 > true > avgt 30 4.882 � 0.025 ms/op > MethodHandlesTableSwitchRandom.testSwitch 5 150 > false > avgt 30 11.532 � 0.034 ms/op > MethodHandlesTableSwitchRandom.testSwitch 10 0 > true > avgt 30 5.065 � 0.076 ms/op > MethodHandlesTableSwitchRandom.testSwitch 10 0 > false > avgt 30 13.016 � 0.020 ms/op > MethodHandlesTableSwitchRandom.testSwitch 10 150 > true > avgt 30 5.103 � 0.051 ms/op > MethodHandlesTableSwitchRandom.testSwitch 10 150 > false > avgt 30 12.984 � 0.102 ms/op > MethodHandlesTableSwitchRandom.testSwitch 25 0 > true > avgt 30 8.441 � 0.165 ms/op > MethodHandlesTableSwitchRandom.testSwitch 25 0 > false > avgt 30 13.371 � 0.060 ms/op > MethodHandlesTableSwitchRandom.testSwitch 25 150 > true > avgt 30 8.628 � 0.032 ms/op > MethodHandlesTableSwitchRandom.testSwitch 25 150 > false > avgt 30 13.542 � 0.020 ms/op > MethodHandlesTableSwitchRandom.testSwitch 50 0 > true > avgt 30 4.701 � 0.015 ms/op > MethodHandlesTableSwitchRandom.testSwitch 50 0 > false > avgt 30 13.562 � 0.063 ms/op > MethodHandlesTableSwitchRandom.testSwitch 50 150 > true > avgt 30 7.991 � 3.111 ms/op > MethodHandlesTableSwitchRandom.testSwitch 50 150 > false > avgt 30 13.543 � 0.088 ms/op > MethodHandlesTableSwitchRandom.testSwitch 100 0 > true > avgt 30 4.712 � 0.020 ms/op > MethodHandlesTableSwitchRandom.testSwitch 100 0 > false > avgt 30 13.600 � 0.085 ms/op > MethodHandlesTableSwitchRandom.testSwitch 100 150 > true > avgt 30 4.676 � 0.011 ms/op > MethodHandlesTableSwitchRandom.testSwitch 100 150 > false > avgt 30 13.476 � 0.043 ms/op > > Testing: > - [x] Running of included benchmarks > - [x] Inspecting inlining trace and verifying method handle targets are > inlined > - [x] Running TestTableSwitch test (currently the only user of the new code) > - [x] Running java/lang/invoke tests (just in case) > - [x] Some manual testing > > Thanks, > Jorn > > ------------- > > Commit messages: > - Improve test > - Touchup > - Use cases array + holder > - WIP - implement tableSwitch combinator in lambda form interpreter > > Changes: https://git.openjdk.java.net/jdk/pull/3401/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3401&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8263087 > Stats: 959 lines in 8 files changed: 955 ins; 0 del; 4 mod > Patch: https://git.openjdk.java.net/jdk/pull/3401.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/3401/head:pull/3401 > > PR: https://git.openjdk.java.net/jdk/pull/3401