Hi Charlie,

Can you send us a decent link or two once it actually does drop? I'm not much of a Ruby head generally, but I would like to see the numbers (and, of course, take a quick look at their testing / benchmarking methodology).
Thanks,

Ben

On Wed, Oct 17, 2012 at 1:53 AM, Charles Oliver Nutter <head...@headius.com> wrote:
> Hello all!
>
> I've recently been informed that a new Ruby implementation is about to
> be announced that puts JRuby's numeric perf to shame. Boo hoo.
>
> It's not like I expected us to retain the numeric crown since we're
> still allocating objects for every number in the system, but hopefully
> we can get that crown back at some point.
>
> In an effort to start getting back to indy + perf work (with JRuby 1.7
> almost released, finally), I bring you today's benchmark:
>
> require 'benchmark'
>
> 50.times {
>   puts Benchmark.measure {
>     f = 20.5; i = 0
>     while i < 2000000
>       # ten float adds and ten float subtracts per iteration
>       f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1
>       f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1
>       f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1
>       f += 0.1; f -= 0.1
>       i += 1
>     end
>   }
> }
>
> So we have a 2M fixnum loop with ten float adds and ten float
> subtracts. Other variations of this have more iterations and fewer
> float operations or put the whole loop inside a times{} block. This
> version runs in about 0.34s on hotspot-comp + Christian's patches,
> which beats Java 7 at 0.39s. If I remove some rarely-followed boolean
> logic in the creation of all Ruby objects (including floats) I can get
> this down to 0.29s. This is many times faster than almost all the
> current Ruby implementations.
>
> However, this new Ruby impl runs the same code in around 0.1s, so even
> with everything inlining JRuby + indy + hotspot-comp + patches is
> still 3x slower. I suspect Float allocation is the main bottleneck
> here.
>
> Here's logc output for one of the adds:
>
> @ 251 java.lang.invoke.LambdaForm$MH::linkToCallSite (18 bytes)
> @ 1 java.lang.invoke.Invokers::getCallSiteTarget (8 bytes)
> @ 4 java.lang.invoke.MutableCallSite::getTarget (5 bytes)
> @ 14 java.lang.invoke.MethodHandle::invokeBasic (0 bytes)
> @ 14 java.lang.invoke.LambdaForm$BMH::reinvoke (32 bytes)
> @ 13 java.lang.invoke.BoundMethodHandle$Species_LD::reinvokerTarget (8 bytes)
> @ 28 java.lang.invoke.MethodHandle::invokeBasic (0 bytes)
> @ 28 java.lang.invoke.LambdaForm$DMH::invokeStatic_LLLD_L (20 bytes)
> @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)
> @ 16 java.lang.invoke.MethodHandle::linkToStatic (0 bytes)
> @ 16 org.jruby.runtime.invokedynamic.MathLinker::float_op_plus (10 bytes)
> @ 6 org.jruby.RubyFloat::op_plus (14 bytes)
> @ 1 org.jruby.RubyBasicObject::getRuntime (8 bytes)
> @ 1 org.jruby.RubyBasicObject::getMetaClass (5 bytes)
> @ 4 org.jruby.RubyClass::getClassRuntime (5 bytes)
> @ 10 org.jruby.RubyFloat::newFloat (10 bytes)
> @ 6 org.jruby.RubyFloat::<init> (15 bytes)
> @ 3 org.jruby.Ruby::getFloat (5 bytes)
> @ 6 org.jruby.RubyNumeric::<init> (7 bytes)
> @ 3 org.jruby.RubyObject::<init> (7 bytes)
> @ 3 org.jruby.RubyBasicObject::<init> (30 bytes)
> @ 1 java.lang.Object::<init> (1 bytes)
>
> This is *great*. We're getting all paths inlined, and allocation
> inlines all the way up to Object::<init>, so in theory escape analysis
> could get rid of this...RIGHT? WRONG!!!
>
> logc appears to be missing some output (either the tool or the
> LogCompilation flag is dropping information). The same block of code
> from PrintInlining:
>
> @ 207 java.lang.invoke.LambdaForm$MH/1942422426::linkToCallSite (18 bytes) inline (hot)
> @ 1 java.lang.invoke.Invokers::getCallSiteTarget (8 bytes) inline (hot)
> @ 4 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
> @ 14 java.lang.invoke.LambdaForm$MH/1896635336::guard (80 bytes) inline (hot)
> @ 12 java.lang.Class::cast (27 bytes) inline (hot)
> @ 6 java.lang.Class::isInstance (0 bytes) (intrinsic)
> @ 17 java.lang.invoke.LambdaForm$BMH/1650319731::reinvoke (30 bytes) inline (hot)
> @ 13 java.lang.invoke.BoundMethodHandle$Species_LL::reinvokerTarget (8 bytes) inline (hot)
> @ 26 java.lang.invoke.LambdaForm$DMH/842171382::invokeStatic_LL_I (15 bytes) inline (hot)
> @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) inline (hot)
> @ 11 org.jruby.runtime.invokedynamic.MathLinker::floatTest (20 bytes) inline (hot)
> @ 8 org.jruby.Ruby::isFloatReopened (5 bytes) inline (hot)
> @ 50 java.lang.invoke.LambdaForm$DMH/952682386::invokeSpecial_LLLL_L (20 bytes) inline (hot)
> @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) inline (hot)
> @ 16 java.lang.invoke.LambdaForm$BMH/1698703785::reinvoke (32 bytes) inline (hot)
> @ 13 java.lang.invoke.BoundMethodHandle$Species_LD::reinvokerTarget (8 bytes) inline (hot)
> @ 28 java.lang.invoke.LambdaForm$DMH/590335041::invokeStatic_LLLD_L (20 bytes) inline (hot)
> @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) inline (hot)
> @ 16 org.jruby.runtime.invokedynamic.MathLinker::float_op_plus (10 bytes) inline (hot)
> @ 6 org.jruby.RubyFloat::op_plus (14 bytes) inline (hot)
> @ 1 org.jruby.RubyBasicObject::getRuntime (8 bytes) inline (hot)
> @ 1 org.jruby.RubyBasicObject::getMetaClass (5 bytes) inline (hot)
> @ 4 org.jruby.RubyClass::getClassRuntime (5 bytes) inline (hot)
> @ 10 org.jruby.RubyFloat::newFloat (10 bytes) inline (hot)
> @ 6 org.jruby.RubyFloat::<init> (15 bytes) inline (hot)
> @ 3 org.jruby.Ruby::getFloat (5 bytes) inline (hot)
> @ 6 org.jruby.RubyNumeric::<init> (7 bytes) inline (hot)
> @ 3 org.jruby.RubyObject::<init> (7 bytes) inline (hot)
> @ 3 org.jruby.RubyBasicObject::<init> (30 bytes) inline (hot)
> @ 1 java.lang.Object::<init> (1 bytes) inline (hot)
> @ 76 java.lang.invoke.LambdaForm$DMH/952682386::invokeSpecial_LLLL_L (20 bytes) call site not reached
>
> So *almost* everything is inlining, but one path (I believe it's the
> failure path from GWT after talking with Christian) is not reached.
> Because Hotspot's EA can't do partial EA, any unfollowed paths that
> would receive the allocated object have to be considered escapes, and
> so anywhere we're doing guarded logic (either in indy or in Java code,
> like Fixnum overflow checks) the unfollowed paths prevent EA from
> happening. Boo-hoo.
>
> At this point there's nothing I can really do. I have to guard the
> call sites in case we don't see a Float at some point, and for Fixnum
> overflow I have to do that boolean check in most cases. There are
> always going to be unfollowed paths dangling off the edges of even our
> simplest logic.
>
> Bottom line is that the new indy stuff is starting to really look good
> wrt inlining, but EA is still not up to the task of eliding
> allocations in the places we need it to.
>
> Thoughts?
>
> - Charlie
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
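
For anyone following along without the JRuby sources handy, the guarded call site Charlie describes can be approximated in plain Java. This is only a minimal sketch under stated assumptions: `GuardedFloatAdd`, `BoxedFloat`, `isBoxedFloat`, `fastAdd` and `slowAdd` are hypothetical names made up for illustration and are not JRuby's actual `RubyFloat` or `MathLinker` code. The only real API used is `MethodHandles.guardWithTest` (the "GWT" in the mail). The point is the shape: the fast path allocates a fresh box on every call, and the fallback branch, although never taken in this loop, would still receive the boxed receiver.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GuardedFloatAdd {
    // Hypothetical stand-in for a boxed Ruby float; NOT JRuby's RubyFloat.
    static final class BoxedFloat {
        final double value;
        BoxedFloat(double value) { this.value = value; }
    }

    // Guard: is the receiver the type the fast path was specialized for?
    static boolean isBoxedFloat(Object receiver, double arg) {
        return receiver instanceof BoxedFloat;
    }

    // Fast path: allocates a new box for every result.
    static Object fastAdd(Object receiver, double arg) {
        return new BoxedFloat(((BoxedFloat) receiver).value + arg);
    }

    // Fallback: never reached in this loop, but it is still a path that
    // takes the boxed receiver, which is the situation the mail says
    // keeps escape analysis from eliminating the allocation.
    static Object slowAdd(Object receiver, double arg) {
        throw new UnsupportedOperationException("generic dispatch not modelled");
    }

    // Builds test ? fastAdd : slowAdd with the real guardWithTest combinator.
    static MethodHandle buildCallSiteTarget() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType binary = MethodType.methodType(Object.class, Object.class, double.class);
        MethodHandle test = lookup.findStatic(GuardedFloatAdd.class, "isBoxedFloat",
                MethodType.methodType(boolean.class, Object.class, double.class));
        MethodHandle target = lookup.findStatic(GuardedFloatAdd.class, "fastAdd", binary);
        MethodHandle fallback = lookup.findStatic(GuardedFloatAdd.class, "slowAdd", binary);
        return MethodHandles.guardWithTest(test, target, fallback);
    }

    // Runs one add and one subtract per iteration through the guarded handle.
    static double run(int iterations) {
        try {
            MethodHandle add = buildCallSiteTarget();
            Object f = new BoxedFloat(20.5);
            for (int i = 0; i < iterations; i++) {
                f = add.invoke(f, 0.1);
                f = add.invoke(f, -0.1);
            }
            return ((BoxedFloat) f).value;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(2_000_000));
    }
}
```

Running this under a JVM with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` should let you check whether the fallback shows up as an unreached call site the way it does in Charlie's output, though the exact log will depend on the JVM build, and a handle held in a local variable may not inline the same way as one bound to an invokedynamic call site.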