On May 21, 2011, at 1:31 AM, Charles Oliver Nutter wrote: > I am still using the specialized math callsites because they inline well and > an Indy-based approach was still quite a bit slower last time I tried. > > I am back to being worried about the inlining thresholds. With no MHs and > runtime profiled direct calls, I was able to get this all to inline. So > basically the fact that I am using Indy and MHs now penalizes me over not > just profiled direct calls, but over a totally generic inline cache that > usually doesn't inline at all. That's pretty bad :)
I was on vacation for a couple of days but now I'm back. I looked at the inlining tree for fib_recursive and I can't see the fib_ruby method getting compiled. But instead I see something I've never seen before: 2949 105 org.jruby.ast.CallOneArgNode::interpret (34 bytes) or 2997 108 ! org.jruby.evaluator.ASTInterpreter::INTERPRET_METHOD (126 bytes) Does that mean JRuby doesn't compile these methods but interpret them? -- Christian > > I can try fiddling with the thresholds later, but do we agree that an > invokedynamic bound to a GWT plus some intermediate adapter handles ending in > a direct handle should inline as well as e.g. an invokevirtual? > > - Charlie (mobile) > > On May 20, 2011, at 17:35, Rémi Forax <fo...@univ-mlv.fr> wrote: > >> On 05/20/2011 11:08 PM, Tom Rodriguez wrote: >>> Is it the inlining policy? I played a little with this and the current >>> default policy is still not as good as the big tweaks to MaxInlineSize and >>> InlineSmallCode. >>> >>> tom >> >> I tend to agree, >> things like isFixnumReopened and getCacheToken seem to be not inlined. >> >> Charles, do you have tested if the inline flags are tweaked ? >> >> Also I do not understand why you have JRuby's callsites like MinusCallSite >> in the stacktrace. You should not use it if invokedynamic is enabled. >> >> Rémi >> >>> On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote: >>> >>>> Meh, false alarm. This is just the default empty Object[] assigned to >>>> all objects. I still can't explain the perf degradation. >>>> >>>> Here's the full assembly for fib_ruby: https://gist.github.com/983721 >>>> >>>> It seems like stuff is inlining, so I have no idea why performance is >>>> so terrible. >>>> >>>> - Charlie >>>> >>>> On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter >>>> <head...@headius.com> wrote: >>>>> Another update...I managed to turn off the ricochet frames and have it >>>>> still work (???). The with-ricochet mode is faster, but still 2-3x >>>>> slower than no indy at all (using JRuby's IC), and 4-5x slower than >>>>> invokedynamic was recently... >>>>> >>>>> HOWEVER...an assembly dump I just did contains some very odd code that >>>>> might indicate a problem in my code. Hold the presses on digging into >>>>> it on your end until I can explain this. (this = apparently standing >>>>> up an instance variable table for Fixnum objects...*definitely* should >>>>> not be happening). >>>>> >>>>> 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial<init> >>>>> ; - >>>>> org.jruby.RubyBasicObject::<init>@1 (line 218) >>>>> ; - >>>>> org.jruby.RubyObject::<init>@2 (line 116) >>>>> ; - >>>>> org.jruby.RubyNumeric::<init>@2 (line 114) >>>>> ; - >>>>> org.jruby.RubyInteger::<init>@2 (line 95) >>>>> ; - >>>>> org.jruby.RubyFixnum::<init>@5 (line 112) >>>>> ; - >>>>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>>>> ; - >>>>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>>>> ; - >>>>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>>>> 0x02868f2d: mov ebx, [esp+0x14] >>>>> 0x02868f31: mov ecx, [ebx+0xBC] >>>>> 0x02868f37: mov edx, eax >>>>> 0x02868f39: shr edx, 9 >>>>> 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable >>>>> ; - >>>>> org.jruby.RubyBasicObject::<init>@8 (line 84) >>>>> ; - >>>>> org.jruby.RubyObject::<init>@2 (line 116) >>>>> ; - >>>>> org.jruby.RubyNumeric::<init>@2 (line 114) >>>>> ; - >>>>> org.jruby.RubyInteger::<init>@2 (line 95) >>>>> ; - >>>>> org.jruby.RubyFixnum::<init>@5 (line 112) >>>>> ; - >>>>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>>>> ; - >>>>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>>>> ; - >>>>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>>>> 0x02868f43: mov edx, a 'java/lang/Class' = >>>>> 'org/jruby/RubyBasicObject' >>>>> ; {oop(a 'java/lang/Class' = >>>>> 'org/jruby/RubyBasicObject')} >>>>> 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY >>>>> ; - >>>>> org.jruby.RubyBasicObject::<init>@5 (line 84) >>>>> ; - >>>>> org.jruby.RubyObject::<init>@2 (line 116) >>>>> ; - >>>>> org.jruby.RubyNumeric::<init>@2 (line 114) >>>>> ; - >>>>> org.jruby.RubyInteger::<init>@2 (line 95) >>>>> ; - >>>>> org.jruby.RubyFixnum::<init>@5 (line 112) >>>>> ; - >>>>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>>>> ; - >>>>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>>>> ; - >>>>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>>>> 0x02868f4b: mov [eax+0x10], ebx >>>>> 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable >>>>> ; - >>>>> org.jruby.RubyBasicObject::<init>@8 (line 84) >>>>> ; - >>>>> org.jruby.RubyObject::<init>@2 (line 116) >>>>> ; - >>>>> org.jruby.RubyNumeric::<init>@2 (line 114) >>>>> ; - >>>>> org.jruby.RubyInteger::<init>@2 (line 95) >>>>> ; - >>>>> org.jruby.RubyFixnum::<init>@5 (line 112) >>>>> ; - >>>>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>>>> ; - >>>>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>>>> ; - >>>>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>>>> >>>>> On Fri, May 20, 2011 at 8:10 AM, Charles Oliver Nutter >>>>> <head...@headius.com> wrote: >>>>>> On Fri, May 20, 2011 at 1:07 AM, Ola Bini<ola.b...@gmail.com> wrote: >>>>>>>> Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. >>>>>>>> >>>>>>>> If you can find a microbenchmark which shows differential slowdown, we >>>>>>>> can debug it. >>>>>>> When trying the above all my tests fail with this: >>>>>>> java.lang.RuntimeException: java.lang.InternalError: NYI >>>>>> Same here. >>>>>> >>>>>> Reproducing it is pretty easy... >>>>>> >>>>>> * Clone JRuby: git clone https://github.com/jruby/jruby.git >>>>>> * Build JRuby: cd jruby ; ant >>>>>> * Run benchmarks: bin/jruby bench/bench_fib_recursive.rb 10 35 >>>>>> >>>>>> There are many benchmarks under the bench dir, almost all of which >>>>>> show this degradation. You can disable invokedynamic use in JRuby by >>>>>> passing -Xcompile.invokedynamic=false, as seen here: >>>>>> >>>>>> >>>>>> ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false >>>>>> bench/bench_fib_recursive.rb 5 35 >>>>>> WARNING: Both jruby-complete.jar and jruby.jar are present in the >>>>>> 'lib' directory. Will use jruby.jar >>>>>> 9227465 >>>>>> 1.698000 0.000000 1.698000 ( 1.645000) >>>>>> 9227465 >>>>>> 1.352000 0.000000 1.352000 ( 1.352000) >>>>>> 9227465 >>>>>> 1.364000 0.000000 1.364000 ( 1.364000) >>>>>> 9227465 >>>>>> 1.344000 0.000000 1.344000 ( 1.344000) >>>>>> 9227465 >>>>>> 1.336000 0.000000 1.336000 ( 1.336000) >>>>>> >>>>>> ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=true >>>>>> bench/bench_fib_recursive.rb 5 35 >>>>>> WARNING: Both jruby-complete.jar and jruby.jar are present in the >>>>>> 'lib' directory. Will use jruby.jar >>>>>> 9227465 >>>>>> 3.182000 0.000000 3.182000 ( 3.120000) >>>>>> 9227465 >>>>>> 3.019000 0.000000 3.019000 ( 3.019000) >>>>>> 9227465 >>>>>> 3.195000 0.000000 3.195000 ( 3.195000) >>>>>> 9227465 >>>>>> 3.163000 0.000000 3.163000 ( 3.163000) >>>>>> 9227465 >>>>>> 3.656000 0.000000 3.656000 ( 3.656000) >>>>>> >>>>>> I'm sure something's just broken...this benchmark was easily almost 2x >>>>>> faster than the non-invokedynamic version just a few days ago, and >>>>>> it's now 2-3x slower than non-indy logic. >>>>>> >>>>>> I'm going to be online all day, standing by to help in any way I can. >>>>>> I did do a quick LogCompilation last night and the main thing I >>>>>> noticed was that while it seemed like handles are inlining, the >>>>>> eventual DMH does not inline. That would explain some of the perf >>>>>> degradation, I think...but hopefully not all of it. >>>>>> >>>>>> Whatever you need me to run or whatever information...let me know. >>>>>> >>>>>> ALSO...I got the same error as Ola when running JRuby with a >>>>>> standalone jar file: >>>>>> >>>>>> ~/projects/jruby ➔ java -jar lib/jruby-complete.jar >>>>>> bench/bench_fib_recursive.rb >>>>>> MethodHandle.java:-1:in `invokeExact': java.lang.NoClassDefFoundError: >>>>>> org/jruby/runtime/ThreadContext >>>>>> from MethodHandle.java:-1:in `invokeExact' >>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' >>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' >>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' >>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' >>>>>> >>>>>> As in Ola's case, ThreadContext is a core JRuby class. I do *not* get >>>>>> this error when JRuby is loaded from the bootstrap classloader, so I >>>>>> imagine there's some mix-up with generated bytecode and the >>>>>> classloaders that that bytecode gets loaded into. >>>>>> >>>>>> - Charlie >>>>>> >>>> _______________________________________________ >>>> mlvm-dev mailing list >>>> mlvm-dev@openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >>> _______________________________________________ >>> mlvm-dev mailing list >>> mlvm-dev@openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > _______________________________________________ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev _______________________________________________ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev