On Jul 28, 2011, at 2:20 AM, Ola Bini wrote:

> Hi,
>
> I've hit a very annoying performance problem with invoke dynamic/method
> handles that makes certain benchmarks about 3 times slower for identical
> operations. This code is related to variable lookup and the basic
> idea is that I have a LexicalScope class which contains a parent
> pointer. It has a LexicalScope.One subclass that extends LexicalScope, a
> LexicalScope.Two that extends LexicalScope.One, etc, and there is a
> field on each of them that contains that indexed variable.
>
> At compile time, I know what lexical depth and index a variable maps to.
> The original code generates straight bytecode for this. My benchmarks
> (depending on depth and breadth of the lexical scope) range from 2.1s
> to 4.1s. The byte code just does this:
>   get the current scope
>   get the parent of the scope (by repeatedly getting the parent field)
>   cast to the specific scope size we are interested in
>   get the field for the index we are interested in
>   do regular return/invocation on this value (this is the same process as
>   the other call paths, so should be fine).
>
> However, when I try to do the same thing with MethodHandles, the best I
> can get it to do is 8.1s to 15s, which is pretty terrible (it was even
> worse before I stopped using methodhandles directly to fields.
> MethodHandles to a getter method gave me 10%).
>
> The actual method handle creation looks a bit like this:
>
>         MethodHandle current = identity(LexicalScope.class);
>
>         int currentDepth = lexicalDepth;
>         while(currentDepth-- > 0) {
>             current = filterArguments(current, 0, PARENT_SCOPE_METHOD);
>         }
>
>         MethodHandle valueMH = null;
>         switch(lexicalIndex) {
>         case 0:
>             valueMH = filterArguments(SCOPE_0_GETTER_M, 0, current);
>             break;
>         case 1:
>             valueMH = filterArguments(SCOPE_1_GETTER_M, 0, current);
>             break;
>         case 2:
>             valueMH = filterArguments(SCOPE_2_GETTER_M, 0, current);
>             break;
>         case 3:
>             valueMH = filterArguments(SCOPE_3_GETTER_M, 0, current);
>             break;
>         case 4:
>             valueMH = filterArguments(SCOPE_4_GETTER_M, 0, current);
>             break;
>         case 5:
>             valueMH = filterArguments(SCOPE_5_GETTER_M, 0, current);
>             break;
>         default:
>             valueMH = filterArguments(insertArguments(SCOPE_N_GETTER_M,
>                 0, lexicalIndex-6), 0, current);
>             break;
>         }
>
>
> The rest just applies the same method handles for invocation/return as
> the rest of the call site is using.
> SCOPE_2_GETTER_M is defined as
>     findVirtual(LexicalScope.Three.class, "getValueThree",
>         methodType(SephObject.class)).asType(SCOPE_GETTER_M_TYPE)
> where getValueThree is just a final getter method.
>
> I tried switching out asType to explicitCastArguments. That ended up
> being about 5% slower. I tried removing the asType by defining all the
> methods on LexicalScope and overriding them (which in practice would
> never call the base method). This didn't give any performance change at all.
>
> So now I'm a bit lost - I have no idea why this is so much slower than
> the explicit bytecode. Any thoughts? My next attack will be to go and
> compare the assembler.
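
For anyone following the thread without the Seph sources, the chain described in the quoted message can be reproduced in a small self-contained sketch. Scope, ScopeOne, getParent and getValueOne are hypothetical stand-ins for the LexicalScope classes, which are not shown here, and the depth of 2 is arbitrary. The sketch keeps the handle in a static final field and calls it with invokeExact, which is one shape a direct MH call can take; HotSpot can generally treat such a field as a constant, but whether it inlines through the chain is an assumption to check against the generated assembler, not something this code proves.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ScopeLookupSketch {
    // Hypothetical stand-ins for LexicalScope and LexicalScope.One.
    static class Scope {
        final Scope parent;
        Scope(Scope parent) { this.parent = parent; }
        final Scope getParent() { return parent; }
    }

    static class ScopeOne extends Scope {
        final Object value0;
        ScopeOne(Scope parent, Object value0) { super(parent); this.value0 = value0; }
        final Object getValueOne() { return value0; }
    }

    // Builds a (Scope)Object handle that walks `depth` parents, then reads slot 0.
    static MethodHandle buildLookup(int depth) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle parent = lookup.findVirtual(
                    Scope.class, "getParent", MethodType.methodType(Scope.class));
            // asType inserts the checked cast from Scope to ScopeOne on the receiver.
            MethodHandle getter = lookup.findVirtual(
                    ScopeOne.class, "getValueOne", MethodType.methodType(Object.class))
                    .asType(MethodType.methodType(Object.class, Scope.class));

            MethodHandle walk = MethodHandles.identity(Scope.class);
            for (int i = 0; i < depth; i++) {
                // Each step applies getParent to the argument before the existing chain.
                walk = MethodHandles.filterArguments(walk, 0, parent);
            }
            return MethodHandles.filterArguments(getter, 0, walk);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Held in a static final field so the JIT can see the handle as a constant.
    static final MethodHandle DEPTH_2_SLOT_0 = buildLookup(2);

    // Direct MH call through the constant handle.
    static Object viaMethodHandle(Scope current) {
        try {
            return (Object) DEPTH_2_SLOT_0.invokeExact(current);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Plain-code equivalent of the steps the generated bytecode performs.
    static Object viaPlainCode(Scope current) {
        return ((ScopeOne) current.getParent().getParent()).getValueOne();
    }

    public static void main(String[] args) {
        Scope outer = new ScopeOne(null, "outer");
        Scope middle = new ScopeOne(outer, "middle");
        Scope inner = new ScopeOne(middle, "inner");
        System.out.println(viaMethodHandle(inner)); // prints outer
        System.out.println(viaPlainCode(inner));    // prints outer
    }
}
```

Both paths return the same value, so the two methods can be dropped into a benchmark harness to compare the handle chain against the plain code under the same conditions.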
The bad performance sounds like something is not inlined at all. How are you invoking valueMH? Via invokedynamic or a direct MH call?

-- Christian

>
> Cheers
> --
>  Ola Bini (http://olabini.com)
>  Ioke - JRuby - ThoughtWorks
>
>  "Yields falsehood when quined" yields falsehood when quined.
>
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev