On Apr 28, 2011, at 9:58 PM, Charles Oliver Nutter wrote:

> I'm trying to figure out why polymorphic dispatch is incredibly slow
> in JRuby + indy. Take this benchmark, for example:
>
> class A; def foo; end; end
> class B; def foo; end; end
>
> a = A.new
> b = B.new
>
> 5.times { puts Benchmark.measure { 1000000.times { a, b = b, a; a.foo;
> b.foo } } }
>
> a.foo and b.foo are bimorphic here. Under stock JRuby, using
> CachingCallSite, this benchmark runs in about 0.13s per iteration.
> Using invokedynamic, it takes 9s!!!
>
> This is after a patch I just committed that caches the target method
> handle for direct paths. I believe the only thing created when GWT
> fails now is a new GWT.
>
> Is it expected that rebinding a call site or constructing a GWT would
> be very expensive?

Looking at the compiled methods, it seems so. There is a lot going on
when creating a new GWT.

> If yes...I will have to look into having a hard
> failover to inline caching or a PIC-like handle chain for polymorphic
> cases. That's not necessarily difficult. If no...I'm happy to update
> my build and play with patches to see what's happening here.
>
> A sampled profile produced the following output:
>
>          Stub + native   Method
>  57.6%     0  +  5214    java.lang.invoke.MethodHandleNatives.init
>  30.9%     0  +  2798    java.lang.invoke.MethodHandleNatives.init
>   2.1%     0  +   189    java.lang.invoke.MethodHandleNatives.getTarget
>   0.1%     0  +     7    java.lang.Object.getClass
>   0.0%     0  +     3    java.lang.Class.isPrimitive
>   0.0%     0  +     3    java.lang.System.arraycopy
>  90.7%     0  +  8214    Total stub
>
> Of course we all know how accurate sampled profiles are, but this is
> a pretty dismal result.

But that seems to be correct.
java.lang.invoke.MethodHandleImpl$GuardWithTest::<init> gets compiled
and the inline tree is:

   8892 135   java.lang.invoke.MethodHandleImpl$GuardWithTest::<init> (22 bytes)
                @ 2   java.lang.invoke.BoundMethodHandle::<init> (37 bytes)   inline (hot)
                  @ 2   java.lang.invoke.MethodHandle::type (5 bytes)   inline (hot)
                  @ 7   java.lang.invoke.MethodType::dropParameterTypes (162 bytes)   already compiled into a big method
                  @ 10   java.lang.invoke.MethodHandle::<init> (15 bytes)   inline (hot)
                    @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                    @ 5   java.lang.Object::getClass (0 bytes)   (intrinsic)
                  @ 20   java.lang.invoke.MethodHandle::type (5 bytes)   inline (hot)
                  @ 24   java.lang.invoke.MethodType::parameterSlotDepth (30 bytes)   inline (hot)
                    @ 26   java.lang.invoke.MethodTypeForm::parameterToArgSlot (9 bytes)   inline (hot)
                  @ 33   java.lang.invoke.BoundMethodHandle::initTarget (7 bytes)   inline (hot)
                    @ 3   java.lang.invoke.MethodHandleNatives::init (0 bytes)   native method

Obviously that is VERY expensive.

-- Christian

> I suspect that this polymorphic cost is a *major* factor in slowing
> down some benchmarks under invokedynamic.
> FWIW, the above benchmark
> without the a,b swap runs in 0.06s, better than 2x faster than stock
> JRuby (yay!).
>
> - Charlie
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
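
To make the "PIC-like handle chain" idea from the quoted message concrete, here is a minimal sketch in plain Java. It is not JRuby's implementation: PicSketch, PicSite, maxDepth, misses, isExactly and the Java classes A and B are hypothetical names made up for illustration, and the guard is a simple exact-class check. The point it tries to show is that each miss stacks a new guardWithTest in front of the existing target instead of replacing it, so a new GWT should only be built once per receiver class seen at the site (up to maxDepth), rather than every time the receiver flips. The sketch is not thread-safe.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class PicSketch {
    // Hypothetical Java stand-ins for the Ruby classes A and B in the benchmark.
    public static class A { public String foo() { return "A#foo"; } }
    public static class B { public String foo() { return "B#foo"; } }

    /** Hypothetical call site that keeps a chain of guards instead of a single one. */
    public static class PicSite extends MutableCallSite {
        static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
        static final MethodType SITE_TYPE =
                MethodType.methodType(Object.class, Object.class);
        static final MethodHandle FALLBACK, CLASS_TEST;
        static {
            try {
                FALLBACK = LOOKUP.findVirtual(PicSite.class, "fallback", SITE_TYPE);
                CLASS_TEST = LOOKUP.findStatic(PicSite.class, "isExactly",
                        MethodType.methodType(boolean.class, Class.class, Object.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        final int maxDepth;             // assumed cap on the chain length
        int depth = 0;                  // guards installed so far
        public int misses = 0;          // how often the slow path ran
        public final MethodHandle invoker;

        public PicSite(int maxDepth) {
            super(SITE_TYPE);
            this.maxDepth = maxDepth;
            setTarget(FALLBACK.bindTo(this));   // start with an empty cache
            this.invoker = dynamicInvoker();
        }

        // Guard: does the receiver have exactly the cached class?
        static boolean isExactly(Class<?> expected, Object receiver) {
            return receiver.getClass() == expected;
        }

        // Slow path: look up foo() and, if there is room, prepend a guard for it.
        Object fallback(Object receiver) throws Throwable {
            misses++;
            MethodHandle target = LOOKUP
                    .findVirtual(receiver.getClass(), "foo",
                            MethodType.methodType(String.class))
                    .asType(SITE_TYPE);
            if (depth < maxDepth) {
                MethodHandle test = CLASS_TEST.bindTo(receiver.getClass());
                // The old target becomes the else-branch, so earlier entries survive.
                setTarget(MethodHandles.guardWithTest(test, target, getTarget()));
                depth++;
            }
            return target.invoke(receiver);
        }
    }

    // Emulates what an invokedynamic instruction linked to the site would do.
    public static Object call(PicSite site, Object receiver) {
        try {
            return site.invoker.invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        PicSite site = new PicSite(4);
        Object a = new A(), b = new B();
        for (int i = 0; i < 1_000_000; i++) {
            Object tmp = a; a = b; b = tmp;     // mirrors "a, b = b, a"
            call(site, a);
            call(site, b);
        }
        System.out.println("slow path ran " + site.misses + " times");
    }
}
```

Once the chain is full, unseen receiver classes keep going through the slow path without touching the site, which is one possible reading of the "hard failover" mentioned above. Whether this is actually cheaper than rebinding on a given build would still need to be measured.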