Re: Good news, bad news
This message was held because it's too big, so I dropboxed the amd64 dumps here: http://dl.dropbox.com/u/17017035/fib_ruby-asm-64bit.zip On Fri, May 27, 2011 at 9:05 AM, Charles Oliver Nutter wrote: > Hopefully these 64-bit dumps are better now that Tom managed to get a > proper 64-bit hsdis built (hooray!). > > On Thu, May 26, 2011 at 6:36 PM, Charles Oliver Nutter > wrote: >> On Thu, May 26, 2011 at 6:21 PM, Charles Oliver Nutter >> wrote: >>> No worries...I'll repost 32-bit asm in a moment. >> >> Attached. This is just fib again. 64-bit is definitely faster than >> 32-bit for this benchmark, but the older macosx build is also still >> faster than the current mlvm: >> >> old macosx: >> >> 9227465 >> 1.235000 0.00 1.235000 ( 1.154000) >> 9227465 >> 1.012000 0.00 1.012000 ( 1.012000) >> 9227465 >> 0.969000 0.00 0.969000 ( 0.969000) >> 9227465 >> 0.962000 0.00 0.962000 ( 0.962000) >> 9227465 >> 0.953000 0.00 0.953000 ( 0.953000) >> >> current mlvm: >> >> 9227465 >> 1.505000 0.00 1.505000 ( 1.406000) >> 9227465 >> 1.175000 0.00 1.175000 ( 1.174000) >> 9227465 >> 1.77 0.00 1.77 ( 1.771000) >> 9227465 >> 1.255000 0.00 1.255000 ( 1.255000) >> 9227465 >> 1.14 0.00 1.14 ( 1.141000) >> >> - Charlie >> > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Oh, forgot to include dynopt numbers under that older macosx build: ~/projects/jruby ➔ jruby -Xcompile.dynopt=true -Xinvokedynamic.invocation=false --server -J-d32 bench/bench_fib_recursive.rb 5 35 9227465 1.366000 0.00 1.366000 ( 1.293000) 9227465 0.949000 0.00 0.949000 ( 0.948000) 9227465 0.965000 0.00 0.965000 ( 0.965000) 9227465 0.932000 0.00 0.932000 ( 0.932000) 9227465 0.924000 0.00 0.924000 ( 0.924000) As mentioned before, the old build was just a hair slower than dynopt. - Charlie On Thu, May 26, 2011 at 6:36 PM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 6:21 PM, Charles Oliver Nutter > wrote: >> No worries...I'll repost 32-bit asm in a moment. > > Attached. This is just fib again. 64-bit is definitely faster than > 32-bit for this benchmark, but the older macosx build is also still > faster than the current mlvm: > > old macosx: > > 9227465 > 1.235000 0.00 1.235000 ( 1.154000) > 9227465 > 1.012000 0.00 1.012000 ( 1.012000) > 9227465 > 0.969000 0.00 0.969000 ( 0.969000) > 9227465 > 0.962000 0.00 0.962000 ( 0.962000) > 9227465 > 0.953000 0.00 0.953000 ( 0.953000) > > current mlvm: > > 9227465 > 1.505000 0.00 1.505000 ( 1.406000) > 9227465 > 1.175000 0.00 1.175000 ( 1.174000) > 9227465 > 1.77 0.00 1.77 ( 1.771000) > 9227465 > 1.255000 0.00 1.255000 ( 1.255000) > 9227465 > 1.14 0.00 1.14 ( 1.141000) > > - Charlie > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
I'll see what I can do on my mac. tom On May 26, 2011, at 3:45 PM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 5:16 PM, John Rose wrote: >> I posted the bsd/amd64 libdis when I was desperate for something that would >> at least link and partially decode instructions. But that hsdis binary >> doesn't really know about x64 disassembly. (The base project doesn't >> support x64. The gnu binutils disassembler is needed.) The download should >> have a warning label; sorry. > > Ahh...well is there another binary you can point me at? Otherwise, > I'll just do i386 asm for now. > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 6:00 PM, John Rose wrote: > I tagged it. It's broken in the sense that it's a brutal 64-bit build of > code designed only for 32-bit disassembly. > > I suggest doing at least some of these investigations with 32-bit builds. > (But, we do need a binutils-based hsdis!) No worries...I'll repost 32-bit asm in a moment. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 3:45 PM, Charles Oliver Nutter wrote: > Ahh...well is there another binary you can point me at? Otherwise, > I'll just do i386 asm for now. I tagged it. It's broken in the sense that it's a brutal 64-bit build of code designed only for 32-bit disassembly. I suggest doing at least some of these investigations with 32-bit builds. (But, we do need a binutils-based hsdis!) -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 5:16 PM, John Rose wrote: > I posted the bsd/amd64 libdis when I was desperate for something that would > at least link and partially decode instructions. But that hsdis binary > doesn't really know about x64 disassembly. (The base project doesn't support > x64. The gnu binutils disassembler is needed.) The download should have a > warning label; sorry. Ahh...well is there another binary you can point me at? Otherwise, I'll just do i386 asm for now. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
I see someone tagged that build as "broken" now on Kenai. Can someone with the appropriate skills do a non-broken build? I've never build the hsdis plugin (but I'm attempting to now. - Charlie On Thu, May 26, 2011 at 4:43 PM, Charles Oliver Nutter wrote: > Hmm, I'll look into it. I used the amd64 build from the hsdis project > on kenai... > > On Thu, May 26, 2011 at 4:27 PM, Tom Rodriguez > wrote: >> There's something deeply bizarro about that assembly. It looks like x64 >> assembly decoded by the 32 assembler. I think your hsdis wasn't built right. >> >> tom >> >> On May 26, 2011, at 2:13 PM, Charles Oliver Nutter wrote: >> >>> On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez >>> wrote: > Great to hear this is in! I'll look for an MLVM patch and until then > see if I can figure out why the "classic" GWT logic isn't as fast as > it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM > + bsd-port from two weeks ago (or find a macosx port build from that > timeframe) to compare assembly output. Perhaps illustrative for us > all! I think that would be informative. >>> >>> Ok, I've attached two assembly dumps, one from mostly-current MLVM >>> (build early this morning) and one from the macosx port from 5/13. >>> This is just fib_ruby. I haven't started to explore the differences >>> yet. >>> >>> - Charlie >>> ___ >>> mlvm-dev mailing list >>> mlvm-dev@openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 2:43 PM, Charles Oliver Nutter wrote: > Hmm, I'll look into it. I used the amd64 build from the hsdis project I posted the bsd/amd64 libdis when I was desperate for something that would at least link and partially decode instructions. But that hsdis binary doesn't really know about x64 disassembly. (The base project doesn't support x64. The gnu binutils disassembler is needed.) The download should have a warning label; sorry. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Hmm, I'll look into it. I used the amd64 build from the hsdis project on kenai... On Thu, May 26, 2011 at 4:27 PM, Tom Rodriguez wrote: > There's something deeply bizarro about that assembly. It looks like x64 > assembly decoded by the 32 assembler. I think your hsdis wasn't built right. > > tom > > On May 26, 2011, at 2:13 PM, Charles Oliver Nutter wrote: > >> On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez >> wrote: Great to hear this is in! I'll look for an MLVM patch and until then see if I can figure out why the "classic" GWT logic isn't as fast as it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM + bsd-port from two weeks ago (or find a macosx port build from that timeframe) to compare assembly output. Perhaps illustrative for us all! >>> >>> I think that would be informative. >> >> Ok, I've attached two assembly dumps, one from mostly-current MLVM >> (build early this morning) and one from the macosx port from 5/13. >> This is just fib_ruby. I haven't started to explore the differences >> yet. >> >> - Charlie >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
There's something deeply bizarro about that assembly. It looks like x64 assembly decoded by the 32 assembler. I think your hsdis wasn't built right. tom On May 26, 2011, at 2:13 PM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez > wrote: >>> Great to hear this is in! I'll look for an MLVM patch and until then >>> see if I can figure out why the "classic" GWT logic isn't as fast as >>> it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM >>> + bsd-port from two weeks ago (or find a macosx port build from that >>> timeframe) to compare assembly output. Perhaps illustrative for us >>> all! >> >> I think that would be informative. > > Ok, I've attached two assembly dumps, one from mostly-current MLVM > (build early this morning) and one from the macosx port from 5/13. > This is just fib_ruby. I haven't started to explore the differences > yet. > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 6:18 AM, Rémi Forax wrote: > For the metaclass, the idea is to not retrieve the metaclass in the fast > path, > if you have one Java class for one ruby class and a 1 to 1 relation between > a class and a metaclass, checking the class and the SwitchPoint of the > metaclass > is sufficient (because you can bind the switch point). Right, this would work if all classes defined in Ruby were 1:1 with a class in Java. This is not the case in JRuby; user-defined classes are rarely backed by a new JVM class, and usually just use RubyObject as their JVM representation. I will likely use this strategy for core JRuby classes, however, by making sure that RubyObject, RubyString, etc are not used as the backing store for user-defined classes, and instead having RubyInheritedObject, RubyInheritedString (or something) for that purpose. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 05/26/2011 12:33 PM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 5:25 AM, Rémi Forax wrote: >> Charles, >> why do you use IRubyObject exactly ? >> why not using Object instead ? >> >> I use Object in PHP.reboot and see no problem. > Legacy, mostly... JRuby still supports Java 5 and 6, which limits how > specialized we can make call paths. Requiring that everything be > IRubyObject means we always have only a single invokeinterface to > retrieve metaclass, access "instance variables", and so on. > > JRuby designed from scratch today would certainly use Object. For the metaclass, the idea is to not retrieve the metaclass in the fast path, if you have one Java class for one ruby class and a 1 to 1 relation between a class and a metaclass, checking the class and the SwitchPoint of the metaclass is sufficient (because you can bind the switch point). > - Charlie Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 5:25 AM, Rémi Forax wrote: > Charles, > why do you use IRubyObject exactly ? > why not using Object instead ? > > I use Object in PHP.reboot and see no problem. Legacy, mostly... JRuby still supports Java 5 and 6, which limits how specialized we can make call paths. Requiring that everything be IRubyObject means we always have only a single invokeinterface to retrieve metaclass, access "instance variables", and so on. JRuby designed from scratch today would certainly use Object. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Charles, why do you use IRubyObject exactly ? why not using Object instead ? I use Object in PHP.reboot and see no problem. Rémi On 05/26/2011 09:15 AM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez > wrote: >> but test is some ugly goo because of boxing. It's relatively easy to get >> the optimizer to fold away the boxing for boolean but sadly it doesn't help >> the performance at all. Additionally that ends up touching a fair amount of >> common code which makes it a little more risky for 7. > It is definitely frustrating to see that perf has degraded so much the > past couple weeks and still not be there with the reverted code. I'm > hoping your fix will help bring the promised ricochet perf to JRuby, > but so far we're a long way off from what perf looked like on earlier > builds. There must be something more obvious than boolean box > folding...perhaps obvious enough we're not seeing it yet. > >> One oddity I've noticed is that what we're ending up with lots of checkcasts >> to IRubyObject but the optimizer doesn't eliminate redundant ones, probably >> because it's a checkcast to an interface. > IRubyObject is a continuing thorn in my side, it seems! I know there's > an outstanding issue where Hotspot won't see two IRubyObject > implementers that share the same implementation of a given method and > inline that method for both of them. It's not seeing through the > differing types to know that the method body is the same. I wonder if > something like that could be at play here? Too many IRubyObject > implementers through the same logic, so it doesn't ever eliminate the > casts? > >>> Great to hear this is in! I'll look for an MLVM patch and until then >>> see if I can figure out why the "classic" GWT logic isn't as fast as >>> it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM >>> + bsd-port from two weeks ago (or find a macosx port build from that >>> timeframe) to compare assembly output. Perhaps illustrative for us >>> all! >> I think that would be informative. > It shall be done! > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 3:42 AM, Christian Thalinger wrote: > I also did some performance testing. I used a recent HS repository, applied > Tom's patch, used a recent meth.jar provided by John and this is what I got: Somehow I stumbled past this reply...comments inline below. > intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server > -Xcompile.invokedynamic=false bench/bench_fib_recursive.rb 10 35 ... > 0.979000 0.00 0.979000 ( 0.979000) > > intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server > -Xcompile.invokedynamic=false -Xcompile.dynopt=true > bench/bench_fib_recursive.rb 10 35 ... > 0.762000 0.00 0.762000 ( 0.762000) > > intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server > -Xcompile.invokedynamic=true bench/bench_fib_recursive.rb 10 35 ... > 0.92 0.00 0.92 ( 0.92) > > Here are the numbers from April 28 on the same machine: > > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002873.html > > Although I can't remember if the invokedynamic numbers were with or without > the guard-removal-hack I tried at that time (I think without, because with > the hack the numbers should be very similar to dynopt, see: > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002869.html). My numbers from that time period (or slightly after, in macosx port) match yours and were based only on code that got into MLVM patchset or into OpenJDK mainline (via macosx port for me), so I don't think they were with the hack. Something definitely degraded between then and now, since *current* JRuby logic is almost always faster on the old builds than on the new ones. I'm not sure yet what it is, but the GWT optimization "opportunity" was obviously part of the original degradation. There are likely other niblets in there somewhere. > Using Tom's new code doesn't make a difference since I think the meth.jar > from John includes the reverted GWT code. I'm rebuilding current MLVM now to run some new tests with the "real" code from Tom, but it worries me that even the reverted GWT code shows degradation compared to late April indy logic. Perhaps Tom is right about the impact of removing your inlining threshold increases? That is in my mind a strong candidate for explaining why perf on these small benchmarks has degraded so much. At the end of the day, I think it's reasonable to have dynopt perf as an "ideal" goal for indy...so I'm proceeding from that point. > A quick check on bench_string_ops.rb shows that a couple of numbers are > better but some are worse, especially string == comparison (compare with > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002887.html): I will have a look at these in more detail as time permits... - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 4:47 AM, Charles Oliver Nutter wrote: > The current logic is 2x faster on the 5/13 build (b141ish maybe?) of By "current logic" I mean "the logic in JRuby master right now". Same JRuby logic, compared between an earlier macosx build and current mlvm build, and the macosx build is always faster. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 4:03 AM, Christian Thalinger wrote: > Is there some additional code that got added to the GWT logic or maybe some > additional code on the JRuby side that would explain the slowdown? The current logic is 2x faster on the 5/13 build (b141ish maybe?) of macosx port than on MLVM as of Tuesday. I'll update to latest MLVM and see what the situation is. I also can get asm dumps from that 5/13 build, which I'll post (alongside current asm dumps) in the next 24hrs or so. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 3:34 AM, John Rose wrote: > I did some more cherry-picking this evening and the mlvm patch queue is now > closer to the reality of what's in jdk7 b144. > > I also pushed Tom's bimorphic MH inlining code, on an experimental basis. Ok, I'll rebuild and use current stuff for exploration. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 11:24 AM, Jochen Theodorou wrote: > Am 26.05.2011 11:03, schrieb Christian Thalinger: > [...] >> I looked at the inlining tree of fib and everything looks good and is >> inlined. There is one invokeExact which is: >> >> @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) too >> big >> >> but bumping the MaxInlineSize just shows that it's: >> >> @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) call >> site not reached >> >> so it doesn't matter anyway. > > sorry for gutting in here. Is that looking at the inlining tree > something normal mortals can also do? I see there some useful > information, even without invokedynamic. So is this some debugging code > not found in the normal JVM? Since a couple of HotSpot builds these switches are also available in product builds: PrintCompilation is a product switch, PrintInlining is a diagnostic switch. Do it like: $ java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -- Christian ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Am 26.05.2011 11:03, schrieb Christian Thalinger: [...] > I looked at the inlining tree of fib and everything looks good and is > inlined. There is one invokeExact which is: > > @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) too > big > > but bumping the MaxInlineSize just shows that it's: > > @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) call > site not reached > > so it doesn't matter anyway. sorry for gutting in here. Is that looking at the inlining tree something normal mortals can also do? I see there some useful information, even without invokedynamic. So is this some debugging code not found in the normal JVM? bye Jochen ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 2:12 AM, Ola Bini wrote: > Hey John, > > Something in the patch queue fails to apply cleanly: That was quick! I noticed that when I spun the meth.jar. Fixed! -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 1:42 AM, Christian Thalinger wrote: > Using Tom's new code doesn't make a difference since I think the meth.jar > from John includes the reverted GWT code. I just added a hook to test whether the GWT change matters or not. Choose either flag setting: java -Djava.lang.invoke.GWT_FORCE_RICOCHET_FRAMES=true java -Djava.lang.invoke.GWT_FORCE_RICOCHET_FRAMES=false If false or missing, you get the old "reverted" logic. This is the refreshed JAR: http://cr.openjdk.java.net/~jrose/pres/indy-javadoc-mlvm/meth.jar The code is also in the mlvm patch repo. > A quick check on bench_string_ops.rb shows that a couple of numbers are > better but some are worse, especially string == comparison (compare with > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002887.html): This rings a bell: I recently had to fix the primitive unboxing logic to make this sort of thing work correctly (compatibly with reflection): MethodHandle iid = identity(int.class); Object o = (p ? (Byte)(byte)1 : (Integer)2); int x = (int) iid.asType(methodType(int.class, Object.class).invokeExact(o); Note that simply casting to Integer and unboxing won't work. So there's a tricky (and potentially slow) test to do in the adaptation of object arguments to primitive formal parameters, or from object return types to primitive return values. If this is a problem, your assembly code might have calls to something ugly called ValueConversions.primitiveConversion. Hopefully, it just has fast-path code which verifies that a reference is the expected box type, and unboxes. There is always a fast path which does this. But funny wrapper mixing (like the Byte above) will cause the slow path to kick in. -- John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 2011-05-26 14.04, John Rose wrote: > On May 26, 2011, at 12:15 AM, Charles Oliver Nutter wrote: > >> It is definitely frustrating to see that perf has degraded so much the >> past couple weeks and still not be there with the reverted code. > > I second that. > > I did some more cherry-picking this evening and the mlvm patch queue is now > closer to the reality of what's in jdk7 b144. > > I also pushed Tom's bimorphic MH inlining code, on an experimental basis. > Hey John, Something in the patch queue fails to apply cleanly: + (cd sources/jdk; hg qpush -a) (working directory not at a head) applying meth-ing-7034977.patch applying meth-conv-6939861.patch applying meth-invoke-7032850.patch applying meth-exc-7044892.patch applying meth-review-7032323.patch applying meth-deprecate.patch patching file src/share/classes/java/lang/invoke/MethodHandle.java Hunk #1 FAILED at 505 1 out of 2 hunks FAILED -- saving rejects to file src/share/classes/java/lang/invoke/MethodHandle.java.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir errors during apply, please fix and refresh meth-deprecate.patch *** Exit status 2. + (cd sources/langtools; hg qpush -a) Cheers -- Ola Bini (http://olabini.com) Ioke - JRuby - ThoughtWorks "Yields falsehood when quined" yields falsehood when quined. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 10:42 AM, Christian Thalinger wrote: > On May 26, 2011, at 9:15 AM, Charles Oliver Nutter wrote: >> On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez >> wrote: >>> but test is some ugly goo because of boxing. It's relatively easy to get >>> the optimizer to fold away the boxing for boolean but sadly it doesn't help >>> the performance at all. Additionally that ends up touching a fair amount >>> of common code which makes it a little more risky for 7. >> >> It is definitely frustrating to see that perf has degraded so much the >> past couple weeks and still not be there with the reverted code. I'm >> hoping your fix will help bring the promised ricochet perf to JRuby, >> but so far we're a long way off from what perf looked like on earlier >> builds. There must be something more obvious than boolean box >> folding...perhaps obvious enough we're not seeing it yet. > > I also did some performance testing. I used a recent HS repository, applied > Tom's patch, used a recent meth.jar provided by John and this is what I got: > Here are the numbers from April 28 on the same machine: > > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002873.html > > Although I can't remember if the invokedynamic numbers were with or without > the guard-removal-hack I tried at that time (I think without, because with > the hack the numbers should be very similar to dynopt, see: > http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002869.html). > > Using Tom's new code doesn't make a difference since I think the meth.jar > from John includes the reverted GWT code. I looked at the inlining tree of fib and everything looks good and is inlined. There is one invokeExact which is: @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) too big but bumping the MaxInlineSize just shows that it's: @ 43 java.lang.invoke.MethodHandle::invokeExact (42 bytes) call site not reached so it doesn't matter anyway. Is there some additional code that got added to the GWT logic or maybe some additional code on the JRuby side that would explain the slowdown? -- Christian ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 26, 2011, at 9:15 AM, Charles Oliver Nutter wrote: > On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez > wrote: >> but test is some ugly goo because of boxing. It's relatively easy to get >> the optimizer to fold away the boxing for boolean but sadly it doesn't help >> the performance at all. Additionally that ends up touching a fair amount of >> common code which makes it a little more risky for 7. > > It is definitely frustrating to see that perf has degraded so much the > past couple weeks and still not be there with the reverted code. I'm > hoping your fix will help bring the promised ricochet perf to JRuby, > but so far we're a long way off from what perf looked like on earlier > builds. There must be something more obvious than boolean box > folding...perhaps obvious enough we're not seeing it yet. I also did some performance testing. I used a recent HS repository, applied Tom's patch, used a recent meth.jar provided by John and this is what I got: intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server -Xcompile.invokedynamic=false bench/bench_fib_recursive.rb 10 35 1.107000 0.00 1.107000 ( 1.078000) 0.982000 0.00 0.982000 ( 0.982000) 0.981000 0.00 0.981000 ( 0.981000) 0.98 0.00 0.98 ( 0.98) 0.98 0.00 0.98 ( 0.98) 0.979000 0.00 0.979000 ( 0.979000) 0.98 0.00 0.98 ( 0.98) 0.98 0.00 0.98 ( 0.98) 0.989000 0.00 0.989000 ( 0.989000) 0.979000 0.00 0.979000 ( 0.979000) intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server -Xcompile.invokedynamic=false -Xcompile.dynopt=true bench/bench_fib_recursive.rb 10 35 0.895000 0.00 0.895000 ( 0.864000) 0.761000 0.00 0.761000 ( 0.761000) 0.763000 0.00 0.763000 ( 0.763000) 0.763000 0.00 0.763000 ( 0.763000) 0.763000 0.00 0.763000 ( 0.763000) 0.765000 0.00 0.765000 ( 0.765000) 0.764000 0.00 0.764000 ( 0.764000) 0.762000 0.00 0.762000 ( 0.762000) 0.772000 0.00 0.772000 ( 0.772000) 0.762000 0.00 0.762000 ( 0.762000) intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server -Xcompile.invokedynamic=true bench/bench_fib_recursive.rb 10 35 1.074000 0.00 1.074000 ( 1.045000) 0.929000 0.00 0.929000 ( 0.928000) 0.925000 0.00 0.925000 ( 0.925000) 0.925000 0.00 0.925000 ( 0.925000) 0.924000 0.00 0.924000 ( 0.924000) 0.923000 0.00 0.923000 ( 0.923000) 0.923000 0.00 0.923000 ( 0.923000) 0.935000 0.00 0.935000 ( 0.935000) 0.926000 0.00 0.926000 ( 0.926000) 0.92 0.00 0.92 ( 0.92) Here are the numbers from April 28 on the same machine: http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002873.html Although I can't remember if the invokedynamic numbers were with or without the guard-removal-hack I tried at that time (I think without, because with the hack the numbers should be very similar to dynopt, see: http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002869.html). Using Tom's new code doesn't make a difference since I think the meth.jar from John includes the reverted GWT code. A quick check on bench_string_ops.rb shows that a couple of numbers are better but some are worse, especially string == comparison (compare with http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-April/002887.html): intelsdv07:~/mlvm/jruby$ bin/jruby.sh -J-Xbootclasspath/p:meth.jar --server -Xcompile.invokedynamic=true -J-XX:+UnlockDiagnosticVMOptions -J-XX:+UseNewCode2 bench/bench_string_ops.rb Measure string array sort time 2.726000 0.00 2.726000 ( 2.698000) 2.651000 0.00 2.651000 ( 2.651000) 2.648000 0.00 2.648000 ( 2.648000) 2.647000 0.00 2.647000 ( 2.647000) 2.632000 0.00 2.632000 ( 2.632000) Measure hash put time 0.184000 0.00 0.184000 ( 0.184000) 0.096000 0.00 0.096000 ( 0.096000) 0.096000 0.00 0.096000 ( 0.096000) 0.10 0.00 0.10 ( 0.10) 0.093000 0.00 0.093000 ( 0.093000) Measure hash get time (note: not same scale as put test) 0.364000 0.00 0.364000 ( 0.364000) 0.251000 0.00 0.251000 ( 0.25) 0.251000 0.00 0.251000 ( 0.251000) 0.258000 0.00 0.258000 ( 0.258000) 0.25 0.00 0.25 ( 0.25) Measure string == comparison time 0.511000 0.00 0.511000 ( 0.511000) 0.314000 0.00 0.314000 ( 0.314000) 0.309000 0.00 0.309000 ( 0.309000) 0.31 0.00 0.31 ( 0.31) 0.31 0.00 0.31 ( 0.31) Measure string == comparison time, different last pos 0.549000 0.00 0.549000 ( 0.549000) 0.367000 0.00 0.367000 ( 0.366000) 0.367000 0.00 0.367000
Re: Good news, bad news
On May 26, 2011, at 12:15 AM, Charles Oliver Nutter wrote: > It is definitely frustrating to see that perf has degraded so much the > past couple weeks and still not be there with the reverted code. I second that. I did some more cherry-picking this evening and the mlvm patch queue is now closer to the reality of what's in jdk7 b144. I also pushed Tom's bimorphic MH inlining code, on an experimental basis. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 26, 2011 at 1:58 AM, Tom Rodriguez wrote: > but test is some ugly goo because of boxing. It's relatively easy to get the > optimizer to fold away the boxing for boolean but sadly it doesn't help the > performance at all. Additionally that ends up touching a fair amount of > common code which makes it a little more risky for 7. It is definitely frustrating to see that perf has degraded so much the past couple weeks and still not be there with the reverted code. I'm hoping your fix will help bring the promised ricochet perf to JRuby, but so far we're a long way off from what perf looked like on earlier builds. There must be something more obvious than boolean box folding...perhaps obvious enough we're not seeing it yet. > One oddity I've noticed is that what we're ending up with lots of checkcasts > to IRubyObject but the optimizer doesn't eliminate redundant ones, probably > because it's a checkcast to an interface. IRubyObject is a continuing thorn in my side, it seems! I know there's an outstanding issue where Hotspot won't see two IRubyObject implementers that share the same implementation of a given method and inline that method for both of them. It's not seeing through the differing types to know that the method body is the same. I wonder if something like that could be at play here? Too many IRubyObject implementers through the same logic, so it doesn't ever eliminate the casts? >> Great to hear this is in! I'll look for an MLVM patch and until then >> see if I can figure out why the "classic" GWT logic isn't as fast as >> it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM >> + bsd-port from two weeks ago (or find a macosx port build from that >> timeframe) to compare assembly output. Perhaps illustrative for us >> all! > > I think that would be informative. It shall be done! - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 25, 2011, at 9:39 PM, Charles Oliver Nutter wrote: > On Wed, May 25, 2011 at 5:12 PM, Tom Rodriguez > wrote: >> I've got the bimorphic method handle inline working. I was doing something >> stupid at the beginning which is why it wasn't as easy as I expected to be >> and then I encountered a stupid API which wasted a little more time. I was >> able to leverage the existing invokedynamic call site cache code generation >> and just generate a different guard. Ignoring the flag soup below, the >> first is without invokedynamic, second is with and third is with bimorphic >> inline for selectAlternative. > > Here's a dumb question...couldn't the bimorphic logic have simply been > generated the second way you indicated, rather than the (x ? a : > b).whatever form? That pattern is something the java.lang.invoke machinery is generating as far I can tell. selectAlternative does something like: (test ? a : b).invokeExact() and I'm converting it to something like: c = (test ? a : b); if (c == a) a.invokeExact(); else b.invokeExact(); The optimizer turns this into: if (test) a.invokeExact(); else b.invokeExact(); but test is some ugly goo because of boxing. It's relatively easy to get the optimizer to fold away the boxing for boolean but sadly it doesn't help the performance at all. Additionally that ends up touching a fair amount of common code which makes it a little more risky for 7. One oddity I've noticed is that what we're ending up with lots of checkcasts to IRubyObject but the optimizer doesn't eliminate redundant ones, probably because it's a checkcast to an interface. > > Great to hear this is in! I'll look for an MLVM patch and until then > see if I can figure out why the "classic" GWT logic isn't as fast as > it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM > + bsd-port from two weeks ago (or find a macosx port build from that > timeframe) to compare assembly output. Perhaps illustrative for us > all! I think that would be informative. tom > >> The changes are at http://javaweb.sfbay.sun.com/~never/webrev/bim. Maybe >> John can put together an mlvm patch for it. I'm not quite sure how to do so. >> >> There are still some minor improvements that can be made. The test part >> ends up looking like: >> >> if ((test ? Boolean.TRUE : Boolean.FALSE).value()) >> >> We need to treat more of those values as constants and split the loads up to >> fold it away. > > I'm keen to have a look at the assembly resulting from the bimorphic > inlining and see if there's any way to improve it further. My test > could potentially be improved too. > > I'm also pondering ways I can go mostly guard-free for certain core > JRuby types, possibly reducing the guard to an exact type check. > There's definitely runway both on your end and my end. > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Wed, May 25, 2011 at 5:12 PM, Tom Rodriguez wrote: > I've got the bimorphic method handle inline working. I was doing something > stupid at the beginning which is why it wasn't as easy as I expected to be > and then I encountered a stupid API which wasted a little more time. I was > able to leverage the existing invokedynamic call site cache code generation > and just generate a different guard. Ignoring the flag soup below, the first > is without invokedynamic, second is with and third is with bimorphic inline > for selectAlternative. Here's a dumb question...couldn't the bimorphic logic have simply been generated the second way you indicated, rather than the (x ? a : b).whatever form? Great to hear this is in! I'll look for an MLVM patch and until then see if I can figure out why the "classic" GWT logic isn't as fast as it was a couple weeks ago. IF I'm feeling saucy I may try to pull MLVM + bsd-port from two weeks ago (or find a macosx port build from that timeframe) to compare assembly output. Perhaps illustrative for us all! > The changes are at http://javaweb.sfbay.sun.com/~never/webrev/bim. Maybe > John can put together an mlvm patch for it. I'm not quite sure how to do so. > > There are still some minor improvements that can be made. The test part ends > up looking like: > > if ((test ? Boolean.TRUE : Boolean.FALSE).value()) > > We need to treat more of those values as constants and split the loads up to > fold it away. I'm keen to have a look at the assembly resulting from the bimorphic inlining and see if there's any way to improve it further. My test could potentially be improved too. I'm also pondering ways I can go mostly guard-free for certain core JRuby types, possibly reducing the guard to an exact type check. There's definitely runway both on your end and my end. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
I've got the bimorphic method handle inline working. I was doing something stupid at the beginning which is why it wasn't as easy as I expected to be and then I encountered a stupid API which wasted a little more time. I was able to leverage the existing invokedynamic call site cache code generation and just generate a different guard. Ignoring the flag soup below, the first is without invokedynamic, second is with and third is with bimorphic inline for selectAlternative. ( setenv _JAVA_OPTIONS "-XX:-PrintCompilation -Xbootclasspath/p:/export/ws/twisti/meth.jar -XX:+UseNewCode -Xbatch -XX:-UseNewCode2 -XX:-LogCompilation -XX:-PrintInlining" ; setenv JAVA_HOME /export/ws/indy/i586/jdk1.7.0 ; ./bin/jruby --server -Xcompile.invokedynamic=false bench/bench_fib_recursive.rb 5 35 ) 2.006000 0.00 2.006000 ( 1.856000) 1.391000 0.00 1.391000 ( 1.391000) 1.386000 0.00 1.386000 ( 1.386000) 1.386000 0.00 1.386000 ( 1.386000) 1.385000 0.00 1.385000 ( 1.385000) ( setenv _JAVA_OPTIONS "-XX:-PrintCompilation -Xbootclasspath/p:/export/ws/twisti/meth.jar -XX:+UseNewCode -Xbatch -XX:-UseNewCode2 -XX:-LogCompilation -XX:-PrintInlining" ; setenv JAVA_HOME /export/ws/indy/i586/jdk1.7.0 ; ./bin/jruby --server -Xcompile.invokedynamic=true bench/bench_fib_recursive.rb 5 35 ) 4.235000 0.00 4.235000 ( 4.086000) 3.689000 0.00 3.689000 ( 3.689000) 3.687000 0.00 3.687000 ( 3.688000) 3.68 0.00 3.68 ( 3.68) 3.68 0.00 3.68 ( 3.68) ( setenv _JAVA_OPTIONS "-XX:-PrintCompilation -Xbootclasspath/p:/export/ws/twisti/meth.jar -XX:+UseNewCode -Xbatch -XX:+UseNewCode2 -XX:-LogCompilation -XX:-PrintInlining" ; setenv JAVA_HOME /export/ws/indy/i586/jdk1.7.0 ; ./bin/jruby --server -Xcompile.invokedynamic=true bench/bench_fib_recursive.rb 5 35 )Picked up 2.103000 0.00 2.103000 ( 1.95) 1.281000 0.00 1.281000 ( 1.281000) 1.275000 0.00 1.275000 ( 1.274000) 1.279000 0.00 1.279000 ( 1.279000) 1.288000 0.00 1.288000 ( 1.288000) The changes are at http://javaweb.sfbay.sun.com/~never/webrev/bim. Maybe John can put together an mlvm patch for it. I'm not quite sure how to do so. There are still some minor improvements that can be made. The test part ends up looking like: if ((test ? Boolean.TRUE : Boolean.FALSE).value()) We need to treat more of those values as constants and split the loads up to fold it away. tom On May 25, 2011, at 12:14 PM, John Rose wrote: > On May 25, 2011, at 9:31 AM, Charles Oliver Nutter wrote: > >> Build in progress! I'll let you know how it goes. How low is low in "low >> arity"? > > About 0-9. > > Let's find the next bottleneck! > > -- John > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Wed, May 25, 2011 at 2:14 PM, John Rose wrote: > On May 25, 2011, at 9:31 AM, Charles Oliver Nutter wrote: > >> Build in progress! I'll let you know how it goes. How low is low in "low >> arity"? > > About 0-9. > > Let's find the next bottleneck! That's the spirit! We can do it! - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 25, 2011, at 9:31 AM, Charles Oliver Nutter wrote: > Build in progress! I'll let you know how it goes. How low is low in "low > arity"? About 0-9. Let's find the next bottleneck! -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Actually now that I look at it, this isn't all that solid performance. It's just *slightly* faster than using JRuby's inline cache, which almost never inlines... Inline cache only, via CachingCallSite. ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false --server bench/bench_fib_recursive.rb 5 359227465 1.23 0.00 1.23 ( 1.114000) 9227465 1.024000 0.00 1.024000 ( 1.024000) 9227465 1.016000 0.00 1.016000 ( 1.016000) 9227465 0.993000 0.00 0.993000 ( 0.993000) 9227465 1.003000 0.00 1.003000 ( 1.003000) So invokedynamic dispatch is better, but only by a tiny margin. :( I'll start browsing logs and asm dumps later today to see if I can figure out why it's not as fast as before. - Charlie On Wed, May 25, 2011 at 12:00 PM, Charles Oliver Nutter wrote: > On Wed, May 25, 2011 at 11:31 AM, Charles Oliver Nutter > wrote: >> Build in progress! I'll let you know how it goes. How low is low in "low >> arity"? > > Ok, we're definitely back to having solid performance, but it's not as > good as it was before. I'm not sure if it's related to the inlining > thresholds, since bumping them up the same way we used to doesn't get > all the perf back. Here's the numbers I have. It's now around 3x > faster than it was a few days ago, but not as fast as JRuby's dynopt > mode (which inserts a guarded invokevirtual next to a normal IC hit). > The invokedynamic logic was nearly as fast as dynopt before, which was > breathtaking to behold. > > Indy, indy with thresholds bumped, dynopt: > > ~/projects/jruby ➔ jruby --server bench/bench_fib_recursive.rb 5 359227465 > 1.215000 0.00 1.215000 ( 1.155000) > 9227465 > 0.985000 0.00 0.985000 ( 0.985000) > 9227465 > 0.977000 0.00 0.977000 ( 0.977000) > 9227465 > 0.981000 0.00 0.981000 ( 0.981000) > 9227465 > 0.976000 0.00 0.976000 ( 0.976000) > > ~/projects/jruby ➔ jruby -J-XX:MaxInlineSize=150 > -J-XX:InlineSmallCode=3000 --server bench/bench_fib_recursive.rb 5 35 > 9227465 > 1.209000 0.00 1.209000 ( 1.132000) > 9227465 > 0.981000 0.00 0.981000 ( 0.981000) > 9227465 > 0.974000 0.00 0.974000 ( 0.974000) > 9227465 > 0.971000 0.00 0.971000 ( 0.971000) > 9227465 > 0.969000 0.00 0.969000 ( 0.969000) > > ~/projects/jruby ➔ jruby -Xcompile.dynopt=true > -Xcompile.invokedynamic=false --server bench/bench_fib_recursive.rb 5 > 35 > 9227465 > 1.078000 0.00 1.078000 ( 1.013000) > 9227465 > 0.706000 0.00 0.706000 ( 0.706000) > 9227465 > 0.697000 0.00 0.697000 ( 0.697000) > 9227465 > 0.699000 0.00 0.699000 ( 0.699000) > 9227465 > 0.695000 0.00 0.695000 ( 0.695000) > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Wed, May 25, 2011 at 11:31 AM, Charles Oliver Nutter wrote: > Build in progress! I'll let you know how it goes. How low is low in "low > arity"? Ok, we're definitely back to having solid performance, but it's not as good as it was before. I'm not sure if it's related to the inlining thresholds, since bumping them up the same way we used to doesn't get all the perf back. Here's the numbers I have. It's now around 3x faster than it was a few days ago, but not as fast as JRuby's dynopt mode (which inserts a guarded invokevirtual next to a normal IC hit). The invokedynamic logic was nearly as fast as dynopt before, which was breathtaking to behold. Indy, indy with thresholds bumped, dynopt: ~/projects/jruby ➔ jruby --server bench/bench_fib_recursive.rb 5 359227465 1.215000 0.00 1.215000 ( 1.155000) 9227465 0.985000 0.00 0.985000 ( 0.985000) 9227465 0.977000 0.00 0.977000 ( 0.977000) 9227465 0.981000 0.00 0.981000 ( 0.981000) 9227465 0.976000 0.00 0.976000 ( 0.976000) ~/projects/jruby ➔ jruby -J-XX:MaxInlineSize=150 -J-XX:InlineSmallCode=3000 --server bench/bench_fib_recursive.rb 5 35 9227465 1.209000 0.00 1.209000 ( 1.132000) 9227465 0.981000 0.00 0.981000 ( 0.981000) 9227465 0.974000 0.00 0.974000 ( 0.974000) 9227465 0.971000 0.00 0.971000 ( 0.971000) 9227465 0.969000 0.00 0.969000 ( 0.969000) ~/projects/jruby ➔ jruby -Xcompile.dynopt=true -Xcompile.invokedynamic=false --server bench/bench_fib_recursive.rb 5 35 9227465 1.078000 0.00 1.078000 ( 1.013000) 9227465 0.706000 0.00 0.706000 ( 0.706000) 9227465 0.697000 0.00 0.697000 ( 0.697000) 9227465 0.699000 0.00 0.699000 ( 0.699000) 9227465 0.695000 0.00 0.695000 ( 0.695000) ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Wed, May 25, 2011 at 12:18 AM, John Rose wrote: > On May 23, 2011, at 5:43 PM, Charles Oliver Nutter wrote: > > Well I have a prototype GWT in place, but I'm having trouble getting > invokeExact to work. As far as I can tell, incoming arguments should > match the handle type just fine, but I'm getting this error: > > Hmm... I don't see the bug in your diff. Looks like someone is trying to > invoke a MH in varargs style (Object...). The error says > "InvokeDynamicSupport.java:934:in `gwt':" but I don't know where this might > be. Yeah, I'm stumped. I'll back off that approach for now and see if the GWT revert gets me back to normal perf. > Anyway, I put GWT back the way it was, for (a) low arity and (b) method > types without primitives. > (For high arity and/or types with primitives, it still uses ricochet frames, > for which Tom found an optimization "opportunity".) > See if it makes things better for you... Build in progress! I'll let you know how it goes. How low is low in "low arity"? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 25, 2011, at 3:23 AM, Christian Thalinger wrote: > I know that. That's why we need to find out where the problem is (some hints > below). Can someone provide a jar file of the current MLVM JSR 292 classes > (like John's meth.jar)? I just posted it with the javadoc: http://cr.openjdk.java.net/~jrose/pres/indy-javadoc-mlvm/ http://cr.openjdk.java.net/~jrose/pres/indy-javadoc-mlvm/meth.jar http://cr.openjdk.java.net/~jrose/pres/indy-javadoc-mlvm/test-classes.jar Also, here's my setup for running the test classes: $ hg clone http://hg.openjdk.java.net/mlvm/mlvm /tmp/mlvm $ hg clone ssh://hg.kenai.com/ninja~indify-repo /tmp/indify $ ant test -Dproject.indify=/tmp/indify -Dplatforms.JDK_1.7.home=$JAVA7X_BUILD -Dsrc.special-config-test-src.dir='' (It's a twisty path. Your mileage will vary.) -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Well, on my linux box with a JDK built this morning I don't see any of these problems actually. (However, there are things on my master that only works with the mlvm patches. Type conversions specifically.) Problem 3 also seems to not be there - you would notice if it were since it's also a total JVM crash. I do not see it on Linux with this version: openjdk version "1.7.0-internal" OpenJDK Runtime Environment (build 1.7.0-internal-obini2_2011_05_25_03_33-b00) OpenJDK 64-Bit Server VM (build 21.0-b13, mixed mode) I built it from http://hg.openjdk.java.net/jdk7/jdk7 so it's not an official drop, but it's not bsdport with mlvm either. Cheers On 2011-05-25 14.29, Christian Thalinger wrote: > On May 25, 2011, at 5:58 AM, Ola Bini wrote: >> Hi, >> >> There are at least three problems that are still there. They might be >> connected, or not. >> (I will tell you how to reproduce these at the end) >> >> I just built a new JVM: >> openjdk version "1.7.0-internal" >> OpenJDK Runtime Environment (build >> 1.7.0-internal-olabini_2011_05_25_08_06-b00) >> OpenJDK Server VM (build 21.0-b09, mixed mode) >> >> I get that weird missing class error when running my test suite: >> java.lang.NoClassDefFoundError: seph/lang/SephObject >> java.lang.RuntimeException: java.lang.NoClassDefFoundError: >> seph/lang/SephObject >> at seph.lang.Runtime.evaluateStream(Runtime.java:132) >> at seph.lang.Runtime.evaluateString(Runtime.java:146) >> at >> seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:214) >> Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject >> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >> at seph$gen$abstraction$41$even?.argument_0_0(:7) >> at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:654) >> at seph$gen$abstraction$41$even?.even?(:7) >> at seph$gen$abstraction$39$toplevel.toplevel(:25) >> at seph.lang.Runtime.evaluateStream(Runtime.java:125) >> >> This is obviously a complete blocker. >> >> Turning off the ricochet frames hits an NYI error when running the test >> suite. >> >> I'm still seeing a crash in ricochet frames: >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # SIGBUS (0xa) at pc=0x0104884f, pid=85043, tid=2953850880 >> # >> # JRE version: 7.0 >> # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) >> # Problematic frame: >> # V [libjvm.dylib+0x4884f] MethodHandles::ricochet_frame_oops_do(frame >> const&, OopClosure*, RegisterMap const*)+0x12f >> # >> # Failed to write core dump. Core dumps have been disabled. To enable >> core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /Users/olabini/workspace/seph/hs_err_pid85043.log >> # >> # If you would like to submit a bug report, please visit: >> # http://bugreport.sun.com/bugreport/crash.jsp >> # >> >> I've attached the error file. >> >> The third error (in frame.cpp) only happens on master of seph, but there >> are some other problems with master that means you get lots of weird output. >> >> In order to reproduce problem 1 and 2: >> git clone git://github.com/seph-lang/seph.git >> git checkout 9075c0f4ffe1adac0657057aee2193f16ad12a43 >> (build.xml: remove lines with ) >> >> problem 1: >> ant >> This should show you one entry of the missing class problem > > I tried this on solaris-i586 and it works: > > BUILD SUCCESSFUL > Total time: 29 seconds > >> >> problem 2: >> ant jar-notest >> bin/seph bench/bench_arithmetic.sp >> This should show you problem 2. >> If you for reference want to see the benchmark run correctly, do >> bin/seph -J-Xint bench/bench_arithmetic.sp > > Works too (with server compiler on b143). > >> >> problem 3: >> git checkout master >> git checkout 99c8f2609d468835390e39b68c73f21cc78e5ab5 >> ant clean jar-notest >> bin/seph bench/bench_arithmetic.sp >> This should show you problem 3. You will also see loads of other >> exceptions, since that point generates slightly bad bytecode. That >> shouldn't make the JVM crash though, I assume - and I've seen the >> frame.cpp should not reach here without seeing those exceptions. > > I think I see the exception you are talking about but there is so much output > that I'm not sure. > > I curious about where the problem lies, either it's a HotSpot fix that's not > in the mlvm repository yet or it's a JDK bug in one of the mlvm patches which > hasn't been pushed yet into the main repository. At least that's my best > guess. > > May I suggest that you guys also try with an official build (like JDK 7 b143) > from here (assuming everyone has some kind of access to a Linux box too): > > http://jdk7.java.net/download.html > > -- Christian > >> >> All of these require that JAVA_HOME points to the Java 7 you want to use >> >> Cheers >> >>> >>> t
Re: Good news, bad news
On May 25, 2011, at 3:20 AM, Ola Bini wrote: > Just to clarify, my builds are against the current patchset in the MLVM > repository, so that might explain why you're not seeing these problems. Good news: The jdk7 repository just integrated a bunch of critical 292 fixes. I hope the bsd-port can catch up quickly, so we can retest on a better foundation. On the other hand, it's possible that the bugs Ola is seeing are not from a lack of critical fixes, but due to (my) new code in the mlvm patch repo. The sooner we find out the better... Ola, I expect we'll nail this. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 25, 2011, at 12:20 PM, Ola Bini wrote: > Hi, > > Just to clarify, my builds are against the current patchset in the MLVM > repository, so that might explain why you're not seeing these problems. I know that. That's why we need to find out where the problem is (some hints below). Can someone provide a jar file of the current MLVM JSR 292 classes (like John's meth.jar)? -- Christian > > Cheers > > On 2011-05-25 14.29, Christian Thalinger wrote: >> On May 25, 2011, at 5:58 AM, Ola Bini wrote: >>> Hi, >>> >>> There are at least three problems that are still there. They might be >>> connected, or not. >>> (I will tell you how to reproduce these at the end) >>> >>> I just built a new JVM: >>> openjdk version "1.7.0-internal" >>> OpenJDK Runtime Environment (build >>> 1.7.0-internal-olabini_2011_05_25_08_06-b00) >>> OpenJDK Server VM (build 21.0-b09, mixed mode) >>> >>> I get that weird missing class error when running my test suite: >>> java.lang.NoClassDefFoundError: seph/lang/SephObject >>> java.lang.RuntimeException: java.lang.NoClassDefFoundError: >>> seph/lang/SephObject >>> at seph.lang.Runtime.evaluateStream(Runtime.java:132) >>> at seph.lang.Runtime.evaluateString(Runtime.java:146) >>> at >>> seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:214) >>> Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject >>> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >>> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >>> at seph$gen$abstraction$41$even?.argument_0_0(:7) >>> at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:654) >>> at seph$gen$abstraction$41$even?.even?(:7) >>> at seph$gen$abstraction$39$toplevel.toplevel(:25) >>> at seph.lang.Runtime.evaluateStream(Runtime.java:125) >>> >>> This is obviously a complete blocker. >>> >>> Turning off the ricochet frames hits an NYI error when running the test >>> suite. >>> >>> I'm still seeing a crash in ricochet frames: >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # SIGBUS (0xa) at pc=0x0104884f, pid=85043, tid=2953850880 >>> # >>> # JRE version: 7.0 >>> # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) >>> # Problematic frame: >>> # V [libjvm.dylib+0x4884f] MethodHandles::ricochet_frame_oops_do(frame >>> const&, OopClosure*, RegisterMap const*)+0x12f >>> # >>> # Failed to write core dump. Core dumps have been disabled. To enable >>> core dumping, try "ulimit -c unlimited" before starting Java again >>> # >>> # An error report file with more information is saved as: >>> # /Users/olabini/workspace/seph/hs_err_pid85043.log >>> # >>> # If you would like to submit a bug report, please visit: >>> # http://bugreport.sun.com/bugreport/crash.jsp >>> # >>> >>> I've attached the error file. >>> >>> The third error (in frame.cpp) only happens on master of seph, but there >>> are some other problems with master that means you get lots of weird output. >>> >>> In order to reproduce problem 1 and 2: >>> git clone git://github.com/seph-lang/seph.git >>> git checkout 9075c0f4ffe1adac0657057aee2193f16ad12a43 >>> (build.xml: remove lines with ) >>> >>> problem 1: >>> ant >>> This should show you one entry of the missing class problem >> >> I tried this on solaris-i586 and it works: >> >> BUILD SUCCESSFUL >> Total time: 29 seconds >> >>> >>> problem 2: >>> ant jar-notest >>> bin/seph bench/bench_arithmetic.sp >>> This should show you problem 2. >>> If you for reference want to see the benchmark run correctly, do >>> bin/seph -J-Xint bench/bench_arithmetic.sp >> >> Works too (with server compiler on b143). >> >>> >>> problem 3: >>> git checkout master >>> git checkout 99c8f2609d468835390e39b68c73f21cc78e5ab5 >>> ant clean jar-notest >>> bin/seph bench/bench_arithmetic.sp >>> This should show you problem 3. You will also see loads of other >>> exceptions, since that point generates slightly bad bytecode. That >>> shouldn't make the JVM crash though, I assume - and I've seen the >>> frame.cpp should not reach here without seeing those exceptions. >> >> I think I see the exception you are talking about but there is so much >> output that I'm not sure. >> >> I curious about where the problem lies, either it's a HotSpot fix that's not >> in the mlvm repository yet or it's a JDK bug in one of the mlvm patches >> which hasn't been pushed yet into the main repository. At least that's my >> best guess. >> >> May I suggest that you guys also try with an official build (like JDK 7 >> b143) from here (assuming everyone has some kind of access to a Linux box >> too): >> >> http://jdk7.java.net/download.html >> >> -- Christian >> >>> >>> All of these require that JAVA_HOME points to the Java 7 you want to use >>> >>> Cheers >>> tom On May 23, 2011, at 7:33 PM, Ola Bini wrote: > Hi, > > I'm happy to see that t
Re: Good news, bad news
Hi, Just to clarify, my builds are against the current patchset in the MLVM repository, so that might explain why you're not seeing these problems. Cheers On 2011-05-25 14.29, Christian Thalinger wrote: > On May 25, 2011, at 5:58 AM, Ola Bini wrote: >> Hi, >> >> There are at least three problems that are still there. They might be >> connected, or not. >> (I will tell you how to reproduce these at the end) >> >> I just built a new JVM: >> openjdk version "1.7.0-internal" >> OpenJDK Runtime Environment (build >> 1.7.0-internal-olabini_2011_05_25_08_06-b00) >> OpenJDK Server VM (build 21.0-b09, mixed mode) >> >> I get that weird missing class error when running my test suite: >> java.lang.NoClassDefFoundError: seph/lang/SephObject >> java.lang.RuntimeException: java.lang.NoClassDefFoundError: >> seph/lang/SephObject >> at seph.lang.Runtime.evaluateStream(Runtime.java:132) >> at seph.lang.Runtime.evaluateString(Runtime.java:146) >> at >> seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:214) >> Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject >> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >> at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) >> at seph$gen$abstraction$41$even?.argument_0_0(:7) >> at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:654) >> at seph$gen$abstraction$41$even?.even?(:7) >> at seph$gen$abstraction$39$toplevel.toplevel(:25) >> at seph.lang.Runtime.evaluateStream(Runtime.java:125) >> >> This is obviously a complete blocker. >> >> Turning off the ricochet frames hits an NYI error when running the test >> suite. >> >> I'm still seeing a crash in ricochet frames: >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # SIGBUS (0xa) at pc=0x0104884f, pid=85043, tid=2953850880 >> # >> # JRE version: 7.0 >> # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) >> # Problematic frame: >> # V [libjvm.dylib+0x4884f] MethodHandles::ricochet_frame_oops_do(frame >> const&, OopClosure*, RegisterMap const*)+0x12f >> # >> # Failed to write core dump. Core dumps have been disabled. To enable >> core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /Users/olabini/workspace/seph/hs_err_pid85043.log >> # >> # If you would like to submit a bug report, please visit: >> # http://bugreport.sun.com/bugreport/crash.jsp >> # >> >> I've attached the error file. >> >> The third error (in frame.cpp) only happens on master of seph, but there >> are some other problems with master that means you get lots of weird output. >> >> In order to reproduce problem 1 and 2: >> git clone git://github.com/seph-lang/seph.git >> git checkout 9075c0f4ffe1adac0657057aee2193f16ad12a43 >> (build.xml: remove lines with ) >> >> problem 1: >> ant >> This should show you one entry of the missing class problem > > I tried this on solaris-i586 and it works: > > BUILD SUCCESSFUL > Total time: 29 seconds > >> >> problem 2: >> ant jar-notest >> bin/seph bench/bench_arithmetic.sp >> This should show you problem 2. >> If you for reference want to see the benchmark run correctly, do >> bin/seph -J-Xint bench/bench_arithmetic.sp > > Works too (with server compiler on b143). > >> >> problem 3: >> git checkout master >> git checkout 99c8f2609d468835390e39b68c73f21cc78e5ab5 >> ant clean jar-notest >> bin/seph bench/bench_arithmetic.sp >> This should show you problem 3. You will also see loads of other >> exceptions, since that point generates slightly bad bytecode. That >> shouldn't make the JVM crash though, I assume - and I've seen the >> frame.cpp should not reach here without seeing those exceptions. > > I think I see the exception you are talking about but there is so much output > that I'm not sure. > > I curious about where the problem lies, either it's a HotSpot fix that's not > in the mlvm repository yet or it's a JDK bug in one of the mlvm patches which > hasn't been pushed yet into the main repository. At least that's my best > guess. > > May I suggest that you guys also try with an official build (like JDK 7 b143) > from here (assuming everyone has some kind of access to a Linux box too): > > http://jdk7.java.net/download.html > > -- Christian > >> >> All of these require that JAVA_HOME points to the Java 7 you want to use >> >> Cheers >> >>> >>> tom >>> >>> On May 23, 2011, at 7:33 PM, Ola Bini wrote: >>> Hi, I'm happy to see that the performance degradation is getting addressed. I would like to point out that there is still a serious crash in the machinery too... Have you seen any reason why that happens? Cheers On 2011-05-24 06.11, John Rose wrote: > On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: > >>> I'd *love* for intermediate static Java snippits like this
Re: Good news, bad news
On May 25, 2011, at 5:58 AM, Ola Bini wrote: > Hi, > > There are at least three problems that are still there. They might be > connected, or not. > (I will tell you how to reproduce these at the end) > > I just built a new JVM: > openjdk version "1.7.0-internal" > OpenJDK Runtime Environment (build > 1.7.0-internal-olabini_2011_05_25_08_06-b00) > OpenJDK Server VM (build 21.0-b09, mixed mode) > > I get that weird missing class error when running my test suite: > java.lang.NoClassDefFoundError: seph/lang/SephObject > java.lang.RuntimeException: java.lang.NoClassDefFoundError: > seph/lang/SephObject > at seph.lang.Runtime.evaluateStream(Runtime.java:132) > at seph.lang.Runtime.evaluateString(Runtime.java:146) > at > seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:214) > Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject > at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) > at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) > at seph$gen$abstraction$41$even?.argument_0_0(:7) > at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:654) > at seph$gen$abstraction$41$even?.even?(:7) > at seph$gen$abstraction$39$toplevel.toplevel(:25) > at seph.lang.Runtime.evaluateStream(Runtime.java:125) > > This is obviously a complete blocker. > > Turning off the ricochet frames hits an NYI error when running the test > suite. > > I'm still seeing a crash in ricochet frames: > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGBUS (0xa) at pc=0x0104884f, pid=85043, tid=2953850880 > # > # JRE version: 7.0 > # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) > # Problematic frame: > # V [libjvm.dylib+0x4884f] MethodHandles::ricochet_frame_oops_do(frame > const&, OopClosure*, RegisterMap const*)+0x12f > # > # Failed to write core dump. Core dumps have been disabled. To enable > core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /Users/olabini/workspace/seph/hs_err_pid85043.log > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > > I've attached the error file. > > The third error (in frame.cpp) only happens on master of seph, but there > are some other problems with master that means you get lots of weird output. > > In order to reproduce problem 1 and 2: > git clone git://github.com/seph-lang/seph.git > git checkout 9075c0f4ffe1adac0657057aee2193f16ad12a43 > (build.xml: remove lines with ) > > problem 1: > ant > This should show you one entry of the missing class problem I tried this on solaris-i586 and it works: BUILD SUCCESSFUL Total time: 29 seconds > > problem 2: > ant jar-notest > bin/seph bench/bench_arithmetic.sp > This should show you problem 2. > If you for reference want to see the benchmark run correctly, do > bin/seph -J-Xint bench/bench_arithmetic.sp Works too (with server compiler on b143). > > problem 3: > git checkout master > git checkout 99c8f2609d468835390e39b68c73f21cc78e5ab5 > ant clean jar-notest > bin/seph bench/bench_arithmetic.sp > This should show you problem 3. You will also see loads of other > exceptions, since that point generates slightly bad bytecode. That > shouldn't make the JVM crash though, I assume - and I've seen the > frame.cpp should not reach here without seeing those exceptions. I think I see the exception you are talking about but there is so much output that I'm not sure. I curious about where the problem lies, either it's a HotSpot fix that's not in the mlvm repository yet or it's a JDK bug in one of the mlvm patches which hasn't been pushed yet into the main repository. At least that's my best guess. May I suggest that you guys also try with an official build (like JDK 7 b143) from here (assuming everyone has some kind of access to a Linux box too): http://jdk7.java.net/download.html -- Christian > > All of these require that JAVA_HOME points to the Java 7 you want to use > > Cheers > >> >> tom >> >> On May 23, 2011, at 7:33 PM, Ola Bini wrote: >> >>> Hi, >>> >>> I'm happy to see that the performance degradation is getting >>> addressed. I would like to point out that there is still a serious >>> crash in the machinery too... Have you seen any reason why that >>> happens? >>> >>> Cheers >>> >>> >>> On 2011-05-24 06.11, John Rose wrote: On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: >> I'd *love* for intermediate static Java snippits like this >> to inline straight through, even if invokeExact would be >> megamorphic under normal circumstances...but I don't think >> that's the case right now, right? > > I haven't been following 292 that closely but it used to be a > piece of code that did precisely that. > > +private Object invoke_L
Re: Good news, bad news
On May 24, 2011, at 10:25 PM, Ola Bini wrote: > I don't use permuteArguments anywhere in my code, actually, but maybe > one of the other combinators is defined in terms of it? No, it's stand-alone. Factoid: All the other combinators are order-stable with respect to arguments. I toyed with using it to normalize complex calling sequences (prim/ref mixes) but never got anywhere. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 2011-05-25 10.50, John Rose wrote: > On May 23, 2011, at 7:33 PM, Ola Bini wrote: > >> Have you seen any reason why that happens? > > There are some known problems (crashers even) with method handle > compilation. > > The permuteArguments transform has a bug when you mix long/doubles > and other argument types. > > I have a crasher in one of my "dispatch pattern" examples that is > similar to some of the code you have been talking about. > > Yes, there is at least one gremlin left in there. > > Thanks for posting the detailed instructions for reproducing your > errors! Thanks! I don't use permuteArguments anywhere in my code, actually, but maybe one of the other combinators is defined in terms of it? Cheers -- Ola Bini (http://olabini.com) Ioke - JRuby - ThoughtWorks "Yields falsehood when quined" yields falsehood when quined. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 7:33 PM, Ola Bini wrote: > Have you seen any reason why that happens? There are some known problems (crashers even) with method handle compilation. The permuteArguments transform has a bug when you mix long/doubles and other argument types. I have a crasher in one of my "dispatch pattern" examples that is similar to some of the code you have been talking about. Yes, there is at least one gremlin left in there. Thanks for posting the detailed instructions for reproducing your errors! -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 5:43 PM, Charles Oliver Nutter wrote: > Well I have a prototype GWT in place, but I'm having trouble getting > invokeExact to work. As far as I can tell, incoming arguments should > match the handle type just fine, but I'm getting this error: Hmm... I don't see the bug in your diff. Looks like someone is trying to invoke a MH in varargs style (Object...). The error says "InvokeDynamicSupport.java:934:in `gwt':" but I don't know where this might be. Anyway, I put GWT back the way it was, for (a) low arity and (b) method types without primitives. (For high arity and/or types with primitives, it still uses ricochet frames, for which Tom found an optimization "opportunity".) See if it makes things better for you... -- John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Hi, There are at least three problems that are still there. They might be connected, or not. (I will tell you how to reproduce these at the end) I just built a new JVM: openjdk version "1.7.0-internal" OpenJDK Runtime Environment (build 1.7.0-internal-olabini_2011_05_25_08_06-b00) OpenJDK Server VM (build 21.0-b09, mixed mode) I get that weird missing class error when running my test suite: java.lang.NoClassDefFoundError: seph/lang/SephObject java.lang.RuntimeException: java.lang.NoClassDefFoundError: seph/lang/SephObject at seph.lang.Runtime.evaluateStream(Runtime.java:132) at seph.lang.Runtime.evaluateString(Runtime.java:146) at seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:214) Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) at seph$gen$abstraction$41$even?.argument_0_0(:7) at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:654) at seph$gen$abstraction$41$even?.even?(:7) at seph$gen$abstraction$39$toplevel.toplevel(:25) at seph.lang.Runtime.evaluateStream(Runtime.java:125) This is obviously a complete blocker. Turning off the ricochet frames hits an NYI error when running the test suite. I'm still seeing a crash in ricochet frames: # # A fatal error has been detected by the Java Runtime Environment: # # SIGBUS (0xa) at pc=0x0104884f, pid=85043, tid=2953850880 # # JRE version: 7.0 # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) # Problematic frame: # V [libjvm.dylib+0x4884f] MethodHandles::ricochet_frame_oops_do(frame const&, OopClosure*, RegisterMap const*)+0x12f # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /Users/olabini/workspace/seph/hs_err_pid85043.log # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # I've attached the error file. The third error (in frame.cpp) only happens on master of seph, but there are some other problems with master that means you get lots of weird output. In order to reproduce problem 1 and 2: git clone git://github.com/seph-lang/seph.git git checkout 9075c0f4ffe1adac0657057aee2193f16ad12a43 (build.xml: remove lines with ) problem 1: ant This should show you one entry of the missing class problem problem 2: ant jar-notest bin/seph bench/bench_arithmetic.sp This should show you problem 2. If you for reference want to see the benchmark run correctly, do bin/seph -J-Xint bench/bench_arithmetic.sp problem 3: git checkout master git checkout 99c8f2609d468835390e39b68c73f21cc78e5ab5 ant clean jar-notest bin/seph bench/bench_arithmetic.sp This should show you problem 3. You will also see loads of other exceptions, since that point generates slightly bad bytecode. That shouldn't make the JVM crash though, I assume - and I've seen the frame.cpp should not reach here without seeing those exceptions. All of these require that JAVA_HOME points to the Java 7 you want to use Cheers > > tom > > On May 23, 2011, at 7:33 PM, Ola Bini wrote: > >> Hi, >> >> I'm happy to see that the performance degradation is getting >> addressed. I would like to point out that there is still a serious >> crash in the machinery too... Have you seen any reason why that >> happens? >> >> Cheers >> >> >> On 2011-05-24 06.11, John Rose wrote: >>> On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: >>> > I'd *love* for intermediate static Java snippits like this > to inline straight through, even if invokeExact would be > megamorphic under normal circumstances...but I don't think > that's the case right now, right? I haven't been following 292 that closely but it used to be a piece of code that did precisely that. +private Object invoke_L0() throws Throwable { + if ((boolean) test.invokeExact()) +return target.invokeExact(); +return fallback.invokeExact(); + } It looks like it was reworked at some point to use selectAlternative but the optimizer was never updated to deal with it properly. It's not particularly hard, we just need to generate code like we would for a bimorphic call site. The current code expects to either get a constant or something else and doesn't inline if it's something else. In this case we have a Phi of two constants which we just need to split. >>> >>> Yes, it would be a quasi-bimorphic call site keyed from a phi of >>> two constants, instead of a profile of two receiver classes. >>> I'm still unclear why you couldn't write your own variant of guardWithTest and have it work but my knowledge of what'
Re: Good news, bad news
I haven't been able to reproduce a crash that looks like either of the issues you reported, the ShouldNotReachHere in frame.cpp and the crash in FastScanClosure. Do you have a test that can reproduce it and are you running against the latest version of the code? tom On May 23, 2011, at 7:33 PM, Ola Bini wrote: > Hi, > > I'm happy to see that the performance degradation is getting addressed. > I would like to point out that there is still a serious crash in the > machinery too... Have you seen any reason why that happens? > > Cheers > > > On 2011-05-24 06.11, John Rose wrote: >> On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: >> I'd *love* for intermediate static Java snippits like this to inline straight through, even if invokeExact would be megamorphic under normal circumstances...but I don't think that's the case right now, right? >>> >>> I haven't been following 292 that closely but it used to be a piece >>> of code that did precisely that. >>> >>> +private Object invoke_L0() throws Throwable { + >>> if ((boolean) test.invokeExact()) +return >>> target.invokeExact(); +return fallback.invokeExact(); + >>> } >>> >>> It looks like it was reworked at some point to use >>> selectAlternative but the optimizer was never updated to deal with >>> it properly. It's not particularly hard, we just need to generate >>> code like we would for a bimorphic call site. The current code >>> expects to either get a constant or something else and doesn't >>> inline if it's something else. In this case we have a Phi of two >>> constants which we just need to split. >> >> Yes, it would be a quasi-bimorphic call site keyed from a phi of two >> constants, instead of a profile of two receiver classes. >> >>> I'm still unclear why you couldn't write your own variant of >>> guardWithTest and have it work but my knowledge of what's really >>> allowed is somewhat limited. >> >> Those snippets will inline (I think), but at some point Charlie will >> want to fetch a bit of constant stuff out of an instance variable. >> That will look non-constant to the optimizer, even if the instance >> variable is final and the enclosing object is a constant reference. >> We made sure this happens for java.lang.invoke classes, but we >> haven't extended it yet to user code, in part because it would have >> its own bug tail to work through. >> >> -- John ___ mlvm-dev >> mailing list mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> > > > -- > Ola Bini (http://olabini.com) > Ioke - JRuby - ThoughtWorks > > "Yields falsehood when quined" yields falsehood when quined. > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Hi, I'm happy to see that the performance degradation is getting addressed. I would like to point out that there is still a serious crash in the machinery too... Have you seen any reason why that happens? Cheers On 2011-05-24 06.11, John Rose wrote: > On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: > >>> I'd *love* for intermediate static Java snippits like this to >>> inline straight through, even if invokeExact would be megamorphic >>> under normal circumstances...but I don't think that's the case >>> right now, right? >> >> I haven't been following 292 that closely but it used to be a piece >> of code that did precisely that. >> >> +private Object invoke_L0() throws Throwable { + >> if ((boolean) test.invokeExact()) +return >> target.invokeExact(); +return fallback.invokeExact(); + >> } >> >> It looks like it was reworked at some point to use >> selectAlternative but the optimizer was never updated to deal with >> it properly. It's not particularly hard, we just need to generate >> code like we would for a bimorphic call site. The current code >> expects to either get a constant or something else and doesn't >> inline if it's something else. In this case we have a Phi of two >> constants which we just need to split. > > Yes, it would be a quasi-bimorphic call site keyed from a phi of two > constants, instead of a profile of two receiver classes. > >> I'm still unclear why you couldn't write your own variant of >> guardWithTest and have it work but my knowledge of what's really >> allowed is somewhat limited. > > Those snippets will inline (I think), but at some point Charlie will > want to fetch a bit of constant stuff out of an instance variable. > That will look non-constant to the optimizer, even if the instance > variable is final and the enclosing object is a constant reference. > We made sure this happens for java.lang.invoke classes, but we > haven't extended it yet to user code, in part because it would have > its own bug tail to work through. > > -- John ___ mlvm-dev > mailing list mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > -- Ola Bini (http://olabini.com) Ioke - JRuby - ThoughtWorks "Yields falsehood when quined" yields falsehood when quined. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Mon, May 23, 2011 at 7:41 PM, John Rose wrote: >> I'm still unclear why you couldn't write your own variant of guardWithTest >> and have it work but my knowledge of what's really allowed is somewhat >> limited. > > Those snippets will inline (I think), but at some point Charlie will want to > fetch a bit of constant stuff out of an instance variable. That will look > non-constant to the optimizer, even if the instance variable is final and the > enclosing object is a constant reference. We made sure this happens for > java.lang.invoke classes, but we haven't extended it yet to user code, in > part because it would have its own bug tail to work through. That's what I figured. See the diff I just posted that basically does GWT by hand. This also plays into the pre/post logic discussion too...ideally there would be a way to indicate to the JVM that my hand-written Java code should be specialized on a per-callsite or per-MH-chain basis, so I could reuse the same code many places without it going all megamorphic on me. You say "haven't extended it yet to user code"...which gives me hope for 1.7.0_1 potentially introducing it :) If I can get past this invokeExact exception, I'll let you know how inlining looks in a polymorphic case. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
Well I have a prototype GWT in place, but I'm having trouble getting invokeExact to work. As far as I can tell, incoming arguments should match the handle type just fine, but I'm getting this error: InvokeDynamicSupport.java:934:in `gwt': java.lang.invoke.WrongMethodTypeException: (Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;Ljava/lang/String;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject; cannot be called with a different arity as ([Ljava/lang/Object;)Ljava/lang/Object; The diff as it stands right now is here: https://gist.github.com/987939 - Charlie On Mon, May 23, 2011 at 7:23 PM, Tom Rodriguez wrote: > > On May 23, 2011, at 5:13 PM, Charles Oliver Nutter wrote: > >> On Mon, May 23, 2011 at 6:17 PM, Rémi Forax wrote: >>> On 05/24/2011 12:44 AM, Tom Rodriguez wrote: I haven't been following 292 that closely but it used to be a piece of code that did precisely that. + private Object invoke_L0() throws Throwable { + if ((boolean) test.invokeExact()) + return target.invokeExact(); + return fallback.invokeExact(); + } It looks like it was reworked at some point to use selectAlternative but the optimizer was never updated to deal with it properly. It's not particularly hard, we just need to generate code like we would for a bimorphic call site. The current code expects to either get a constant or something else and doesn't inline if it's something else. In this case we have a Phi of two constants which we just need to split. >> >> I'll give it a shot. I'll put together a benchmark that produces >> "effectively monomorphic" calls at the invokedynamic call site, but >> which uses an intermediate hand-written piece of Java code (that ends >> up polymorphic) in place of GWT. >> >> Obviously GWT should not fail this badly in final release, but if this >> works it could mean many, many other patterns will too. And it will >> definitively answer the question about whether such a case inlines in >> current logic. > > You can also just wait for a fix. I've started writing the logic for it and > just need to duplicate the guard. We should be able to have this fixed for > JDK7. > > tom > >> >>> The main problem of writing your own variant is that you will have to >>> write 256 * (number_of_primitive_type + 1) variants. >>> The main idea of GWT (and all other method handle combiners) is that >>> they provide a way to get such polymorphic variant >>> without coding all combinations. >>> >>> The benefit is also that these combiners are known by the VM that can >>> provide >>> dedicated path that are more efficient, >>> by ex, the arguments of GWT is pushed only once and used by the test >>> *and* the target (or fallback) >>> or better guarantee about how trees of combiners are inlined. >> >> I am not currently specializing any call paths for primitives, but >> that will come soon. For experiment's sake, I only need to provide one >> such GWT replacement to test fib (no block passed, single IRubyObject >> argument) and one more for the dispatch bench (no block, no args). >> We'll see how it looks. >> >> - Charlie >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 3:44 PM, Tom Rodriguez wrote: >> I'd *love* for intermediate static Java snippits like this to inline >> straight through, even if invokeExact would be megamorphic under >> normal circumstances...but I don't think that's the case right now, >> right? > > I haven't been following 292 that closely but it used to be a piece of code > that did precisely that. > > +private Object invoke_L0() throws Throwable { > +if ((boolean) test.invokeExact()) > +return target.invokeExact(); > +return fallback.invokeExact(); > +} > > It looks like it was reworked at some point to use selectAlternative but the > optimizer was never updated to deal with it properly. It's not particularly > hard, we just need to generate code like we would for a bimorphic call site. > The current code expects to either get a constant or something else and > doesn't inline if it's something else. In this case we have a Phi of two > constants which we just need to split. Yes, it would be a quasi-bimorphic call site keyed from a phi of two constants, instead of a profile of two receiver classes. > I'm still unclear why you couldn't write your own variant of guardWithTest > and have it work but my knowledge of what's really allowed is somewhat > limited. Those snippets will inline (I think), but at some point Charlie will want to fetch a bit of constant stuff out of an instance variable. That will look non-constant to the optimizer, even if the instance variable is final and the enclosing object is a constant reference. We made sure this happens for java.lang.invoke classes, but we haven't extended it yet to user code, in part because it would have its own bug tail to work through. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 5:13 PM, Charles Oliver Nutter wrote: > On Mon, May 23, 2011 at 6:17 PM, Rémi Forax wrote: >> On 05/24/2011 12:44 AM, Tom Rodriguez wrote: >>> I haven't been following 292 that closely but it used to be a piece of code >>> that did precisely that. >>> >>> +private Object invoke_L0() throws Throwable { >>> +if ((boolean) test.invokeExact()) >>> +return target.invokeExact(); >>> +return fallback.invokeExact(); >>> +} >>> >>> It looks like it was reworked at some point to use selectAlternative but >>> the optimizer was never updated to deal with it properly. It's not >>> particularly hard, we just need to generate code like we would for a >>> bimorphic call site. The current code expects to either get a constant or >>> something else and doesn't inline if it's something else. In this case we >>> have a Phi of two constants which we just need to split. > > I'll give it a shot. I'll put together a benchmark that produces > "effectively monomorphic" calls at the invokedynamic call site, but > which uses an intermediate hand-written piece of Java code (that ends > up polymorphic) in place of GWT. > > Obviously GWT should not fail this badly in final release, but if this > works it could mean many, many other patterns will too. And it will > definitively answer the question about whether such a case inlines in > current logic. You can also just wait for a fix. I've started writing the logic for it and just need to duplicate the guard. We should be able to have this fixed for JDK7. tom > >> The main problem of writing your own variant is that you will have to >> write 256 * (number_of_primitive_type + 1) variants. >> The main idea of GWT (and all other method handle combiners) is that >> they provide a way to get such polymorphic variant >> without coding all combinations. >> >> The benefit is also that these combiners are known by the VM that can >> provide >> dedicated path that are more efficient, >> by ex, the arguments of GWT is pushed only once and used by the test >> *and* the target (or fallback) >> or better guarantee about how trees of combiners are inlined. > > I am not currently specializing any call paths for primitives, but > that will come soon. For experiment's sake, I only need to provide one > such GWT replacement to test fib (no block passed, single IRubyObject > argument) and one more for the dispatch bench (no block, no args). > We'll see how it looks. > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Mon, May 23, 2011 at 6:17 PM, Rémi Forax wrote: > On 05/24/2011 12:44 AM, Tom Rodriguez wrote: >> I haven't been following 292 that closely but it used to be a piece of code >> that did precisely that. >> >> + private Object invoke_L0() throws Throwable { >> + if ((boolean) test.invokeExact()) >> + return target.invokeExact(); >> + return fallback.invokeExact(); >> + } >> >> It looks like it was reworked at some point to use selectAlternative but the >> optimizer was never updated to deal with it properly. It's not particularly >> hard, we just need to generate code like we would for a bimorphic call site. >> The current code expects to either get a constant or something else and >> doesn't inline if it's something else. In this case we have a Phi of two >> constants which we just need to split. I'll give it a shot. I'll put together a benchmark that produces "effectively monomorphic" calls at the invokedynamic call site, but which uses an intermediate hand-written piece of Java code (that ends up polymorphic) in place of GWT. Obviously GWT should not fail this badly in final release, but if this works it could mean many, many other patterns will too. And it will definitively answer the question about whether such a case inlines in current logic. > The main problem of writing your own variant is that you will have to > write 256 * (number_of_primitive_type + 1) variants. > The main idea of GWT (and all other method handle combiners) is that > they provide a way to get such polymorphic variant > without coding all combinations. > > The benefit is also that these combiners are known by the VM that can > provide > dedicated path that are more efficient, > by ex, the arguments of GWT is pushed only once and used by the test > *and* the target (or fallback) > or better guarantee about how trees of combiners are inlined. I am not currently specializing any call paths for primitives, but that will come soon. For experiment's sake, I only need to provide one such GWT replacement to test fib (no block passed, single IRubyObject argument) and one more for the dispatch bench (no block, no args). We'll see how it looks. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 05/24/2011 12:44 AM, Tom Rodriguez wrote: > On May 23, 2011, at 3:27 PM, Charles Oliver Nutter wrote: > >> On Mon, May 23, 2011 at 5:20 PM, Tom Rodriguez >> wrote: >>> On May 23, 2011, at 2:50 PM, Charles Oliver Nutter wrote: I also do not see any indication *why* nothing inlines past this point. Usually it would say "too big" or something. >>> If there's no message then it wasn't even a candidate for inlining, either >>> because the path was trimmed away by the optimizer or there wasn't enough >>> information to bind the callsite to a particular method. It's probably the >>> latter in this case. >> Good to know I'm reading this output right :) Is there another logging >> flag that will log these situations too? I'd like to have that not >> just for invokedynamic opto. >> >>> Actually my guess is that the selectAlternative machinery is what's >>> screwing it up. That code gets emitted as: >>> >>> (test ? a : b).invokeExact(); >>> >>> instead of >>> >>> if (test) a.invokeExact(); else b.invokeExact(); >>> >>> Since the first one only uses a single call site we can't statically inline >>> even if a and b are constants. The optimizer could do that optimization >>> but it currently doesn't. Can you build the guard differently? We'll need >>> to build a new call site optimization to deal with the selectAlternative >>> logic. >> Is this a question for me? If so... >> >> Build the guard differently as in don't use >> MethodHandles.guardWithTest? I probably could, but it would require >> inserting "normal" Java code into the call path, which should in >> theory be megamorphic (since all invokedynamic everywhere will go >> through that same piece of code). Something like: >> >> IRubyObject myGuardWithTest(MethodHandle dmh, args...) { >> if (test(args) { >> return dmh.invokeExact(args); >> } else { >> inline cache >> } >> } >> >> I'd *love* for intermediate static Java snippits like this to inline >> straight through, even if invokeExact would be megamorphic under >> normal circumstances...but I don't think that's the case right now, >> right? > I haven't been following 292 that closely but it used to be a piece of code > that did precisely that. > > +private Object invoke_L0() throws Throwable { > +if ((boolean) test.invokeExact()) > +return target.invokeExact(); > +return fallback.invokeExact(); > +} > > It looks like it was reworked at some point to use selectAlternative but the > optimizer was never updated to deal with it properly. It's not particularly > hard, we just need to generate code like we would for a bimorphic call site. > The current code expects to either get a constant or something else and > doesn't inline if it's something else. In this case we have a Phi of two > constants which we just need to split. > > I'm still unclear why you couldn't write your own variant of guardWithTest > and have it work but my knowledge of what's really allowed is somewhat > limited. The main problem of writing your own variant is that you will have to write 256 * (number_of_primitive_type + 1) variants. The main idea of GWT (and all other method handle combiners) is that they provide a way to get such polymorphic variant without coding all combinations. The benefit is also that these combiners are known by the VM that can provide dedicated path that are more efficient, by ex, the arguments of GWT is pushed only once and used by the test *and* the target (or fallback) or better guarantee about how trees of combiners are inlined. > tom Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 3:27 PM, Charles Oliver Nutter wrote: > On Mon, May 23, 2011 at 5:20 PM, Tom Rodriguez > wrote: >> >> On May 23, 2011, at 2:50 PM, Charles Oliver Nutter wrote: >>> I also do not see any indication *why* nothing inlines past this >>> point. Usually it would say "too big" or something. >> >> If there's no message then it wasn't even a candidate for inlining, either >> because the path was trimmed away by the optimizer or there wasn't enough >> information to bind the callsite to a particular method. It's probably the >> latter in this case. > > Good to know I'm reading this output right :) Is there another logging > flag that will log these situations too? I'd like to have that not > just for invokedynamic opto. > >> Actually my guess is that the selectAlternative machinery is what's screwing >> it up. That code gets emitted as: >> >> (test ? a : b).invokeExact(); >> >> instead of >> >> if (test) a.invokeExact(); else b.invokeExact(); >> >> Since the first one only uses a single call site we can't statically inline >> even if a and b are constants. The optimizer could do that optimization but >> it currently doesn't. Can you build the guard differently? We'll need to >> build a new call site optimization to deal with the selectAlternative logic. > > Is this a question for me? If so... > > Build the guard differently as in don't use > MethodHandles.guardWithTest? I probably could, but it would require > inserting "normal" Java code into the call path, which should in > theory be megamorphic (since all invokedynamic everywhere will go > through that same piece of code). Something like: > > IRubyObject myGuardWithTest(MethodHandle dmh, args...) { > if (test(args) { >return dmh.invokeExact(args); > } else { >inline cache > } > } > > I'd *love* for intermediate static Java snippits like this to inline > straight through, even if invokeExact would be megamorphic under > normal circumstances...but I don't think that's the case right now, > right? I haven't been following 292 that closely but it used to be a piece of code that did precisely that. +private Object invoke_L0() throws Throwable { +if ((boolean) test.invokeExact()) +return target.invokeExact(); +return fallback.invokeExact(); +} It looks like it was reworked at some point to use selectAlternative but the optimizer was never updated to deal with it properly. It's not particularly hard, we just need to generate code like we would for a bimorphic call site. The current code expects to either get a constant or something else and doesn't inline if it's something else. In this case we have a Phi of two constants which we just need to split. I'm still unclear why you couldn't write your own variant of guardWithTest and have it work but my knowledge of what's really allowed is somewhat limited. tom > > - Charlie > >> >> tom >> >>> >>> I do see MinusCallSite inline earlier. >>> >>> - Charlie >>> >>> On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter >>> wrote: The following chunk should be the invokedynamic call to fib, via a GWT, an arg permuter, and perhaps one convert: @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test (20 bytes) @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) @ 1 java.lang.Boolean::valueOf (14 bytes) @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) @ 11 java.lang.Boolean::booleanValue (5 bytes) @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes) @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) This seems to only be the test logic; the actual fib invocation doesn't appear to show up in the inlining graph at all. Am I right? I see two of these in the LogCompilation output and nothing else around them. I'd expect to see them do the invocation of fib_ruby somewhere in there. It's like the "success" branch of GWT is not even being considered for inlining. - Charlie On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez wrote: > If there were to be a recursive inline in there, where would it occur? I > can't tell from the names where in that inline tree where the recursive
Re: Good news, bad news
> On Mon, May 23, 2011 at 5:20 PM, Tom Rodriguez > wrote: >> Actually my guess is that the selectAlternative machinery is what's screwing >> it up. That code gets emitted as: Same benchmark but instead of binding a direct handle via a GWT, I force it to fail over to an inline cache immediately (still behind invokedynamic): ~/projects/jruby ➔ jruby --server -X+C -Xinvokedynamic.maxfail=0 bench/language/bench_method_dispatch_only.rb Test ruby method: 1000k loops calling self's foo 10 times 0.515000 0.00 0.515000 ( 0.403000) 0.174000 0.00 0.174000 ( 0.174000) 0.136000 0.00 0.136000 ( 0.136000) 0.138000 0.00 0.138000 ( 0.138000) 0.164000 0.00 0.164000 ( 0.164000) 0.156000 0.00 0.156000 ( 0.157000) 0.15 0.00 0.15 ( 0.15) 0.145000 0.00 0.145000 ( 0.145000) 0.13 0.00 0.13 ( 0.13) 0.124000 0.00 0.124000 ( 0.124000) "maxfail" here indicates how many GWT it will attempt to chain for a polymorphic call site before bailing out to a simple inline caching target. That inline caching target is bound *directly* into the invokedynamic call site, with no GWT in front of it. And as expected, performance is basically the same as JRuby with inline caching only. So it seems GWT may be the culprit. Inlining graph for the above run, showing that the inline cache logic ("fail" methods in InvokeDynamicSupport in JRuby) all inline into the invokedynamic call site: 1% bench.language.bench_method_dispatch_only::method__1$RUBY$invoking @ 18 (177 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (13 bytes) @ 6 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::fail (97 bytes) @ 2 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::pollAndGetClass (13 bytes) @ 1 org.jruby.runtime.ThreadContext::callThreadPoll (23 bytes) @ 19 org.jruby.runtime.ThreadContext::pollThreadEvents executed < MinInliningThreshold times @ 5 org.jruby.runtime.builtin.IRubyObject::getMetaClass (0 bytes) type profile org/jruby/runtime/builtin/IRubyObject -> org/jruby/RubyObject (100%) @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) @ 17 org.jruby.runtime.callsite.CacheEntry::typeOk (6 bytes) @ 2 org.jruby.runtime.callsite.CacheEntry::typeOk (17 bytes) @ 5 org.jruby.RubyModule::getCacheToken (5 bytes) @ 34 org.jruby.internal.runtime.methods.DynamicMethod::call (13 bytes) type profile org/jruby/internal/runtime/methods/DynamicMethod -> bench$language$bench_method_dispatch_only$method__0$RUBY$foo (100%) @ 34 bench$language$bench_method_dispatch_only$method__0$RUBY$foo::call (13 bytes) @ 9 bench$language$bench_method_dispatch_only$method__0$RUBY$foo::call (21 bytes) @ 17 bench.language.bench_method_dispatch_only::method__0$RUBY$foo (2 bytes) @ 42 org.jruby.RubyModule::searchWithCache call site not reached @ 50 org.jruby.runtime.invokedynamic.InvokeDynamicSupport$JRubyCallSite::callType (5 bytes) @ 56 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::methodMissing executed < MinInliningThreshold times @ 65 org.jruby.runtime.invokedynamic.InvokeDynamicSupport$JRubyCallSite::callType (5 bytes) @ 72 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::callMethodMissing never executed @ 93 org.jruby.internal.runtime.methods.DynamicMethod::call (13 bytes) Note that "foo" is shown inlining here as well, but I think that's only because this particular inline-caching call path has only seen one target at this point. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Mon, May 23, 2011 at 5:20 PM, Tom Rodriguez wrote: > > On May 23, 2011, at 2:50 PM, Charles Oliver Nutter wrote: >> I also do not see any indication *why* nothing inlines past this >> point. Usually it would say "too big" or something. > > If there's no message then it wasn't even a candidate for inlining, either > because the path was trimmed away by the optimizer or there wasn't enough > information to bind the callsite to a particular method. It's probably the > latter in this case. Good to know I'm reading this output right :) Is there another logging flag that will log these situations too? I'd like to have that not just for invokedynamic opto. > Actually my guess is that the selectAlternative machinery is what's screwing > it up. That code gets emitted as: > > (test ? a : b).invokeExact(); > > instead of > > if (test) a.invokeExact(); else b.invokeExact(); > > Since the first one only uses a single call site we can't statically inline > even if a and b are constants. The optimizer could do that optimization but > it currently doesn't. Can you build the guard differently? We'll need to > build a new call site optimization to deal with the selectAlternative logic. Is this a question for me? If so... Build the guard differently as in don't use MethodHandles.guardWithTest? I probably could, but it would require inserting "normal" Java code into the call path, which should in theory be megamorphic (since all invokedynamic everywhere will go through that same piece of code). Something like: IRubyObject myGuardWithTest(MethodHandle dmh, args...) { if (test(args) { return dmh.invokeExact(args); } else { inline cache } } I'd *love* for intermediate static Java snippits like this to inline straight through, even if invokeExact would be megamorphic under normal circumstances...but I don't think that's the case right now, right? - Charlie > > tom > >> >> I do see MinusCallSite inline earlier. >> >> - Charlie >> >> On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter >> wrote: >>> The following chunk should be the invokedynamic call to fib, via a >>> GWT, an arg permuter, and perhaps one convert: >>> >>> @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) >>> @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) >>> @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test >>> (20 bytes) >>> @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) >>> @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) >>> @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) >>> @ 1 java.lang.Boolean::valueOf (14 bytes) >>> @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) >>> @ 11 java.lang.Boolean::booleanValue (5 bytes) >>> @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 >>> bytes) >>> @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) >>> >>> This seems to only be the test logic; the actual fib invocation >>> doesn't appear to show up in the inlining graph at all. Am I right? >>> >>> I see two of these in the LogCompilation output and nothing else >>> around them. I'd expect to see them do the invocation of fib_ruby >>> somewhere in there. It's like the "success" branch of GWT is not even >>> being considered for inlining. >>> >>> - Charlie >>> >>> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez >>> wrote: If there were to be a recursive inline in there, where would it occur? I can't tell from the names where in that inline tree where the recursive call occurs. tom On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: > fib_ruby LogCompilation inlining graph, showing that fib_ruby is not > inlined: https://gist.github.com/f2b665ad3c97ba622ebf > > Can anyone suggest other flags I can try to adjust to get things to > inline better? > > FWIW, the handle chain in question that's not inlining is pretty simple: > > * DMH pointing back at fib_ruby > * permute args > * GWT > > - Charlie > > On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter > wrote: >> I'm working up a set of files that show JRuby compilation output, but >> I noticed a couple things that might be interesting right now. >> >> First off, fairly early in the assembly output for fib, I see this: >> >> 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop >> [28]=Oop [40]=Oop [48]=Oop off=644} >> ;*invokespecial invokeExact >> ; - >> java.
Re: Good news, bad news
On May 23, 2011, at 2:50 PM, Charles Oliver Nutter wrote: > Also, fwiw...after these two chunks in LogCompilation output, I see > nothing else inlined into fib_ruby, including a monomorphic call path > through PlusCallSite ending at RubyFixnum#op_plus (the integer + > operation). That would also affect performance. > > I also do not see any indication *why* nothing inlines past this > point. Usually it would say "too big" or something. If there's no message then it wasn't even a candidate for inlining, either because the path was trimmed away by the optimizer or there wasn't enough information to bind the callsite to a particular method. It's probably the latter in this case. Actually my guess is that the selectAlternative machinery is what's screwing it up. That code gets emitted as: (test ? a : b).invokeExact(); instead of if (test) a.invokeExact(); else b.invokeExact(); Since the first one only uses a single call site we can't statically inline even if a and b are constants. The optimizer could do that optimization but it currently doesn't. Can you build the guard differently? We'll need to build a new call site optimization to deal with the selectAlternative logic. tom > > I do see MinusCallSite inline earlier. > > - Charlie > > On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter > wrote: >> The following chunk should be the invokedynamic call to fib, via a >> GWT, an arg permuter, and perhaps one convert: >> >>@ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>@ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) >> @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) >>@ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test >> (20 bytes) >> @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) >> @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) >> @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) >>@ 1 java.lang.Boolean::valueOf (14 bytes) >>@ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>@ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) >> @ 11 java.lang.Boolean::booleanValue (5 bytes) >> @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes) >>@ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) >> >> This seems to only be the test logic; the actual fib invocation >> doesn't appear to show up in the inlining graph at all. Am I right? >> >> I see two of these in the LogCompilation output and nothing else >> around them. I'd expect to see them do the invocation of fib_ruby >> somewhere in there. It's like the "success" branch of GWT is not even >> being considered for inlining. >> >> - Charlie >> >> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez >> wrote: >>> If there were to be a recursive inline in there, where would it occur? I >>> can't tell from the names where in that inline tree where the recursive >>> call occurs. >>> >>> tom >>> >>> On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: >>> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not inlined: https://gist.github.com/f2b665ad3c97ba622ebf Can anyone suggest other flags I can try to adjust to get things to inline better? FWIW, the handle chain in question that's not inlining is pretty simple: * DMH pointing back at fib_ruby * permute args * GWT - Charlie On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter wrote: > I'm working up a set of files that show JRuby compilation output, but > I noticed a couple things that might be interesting right now. > > First off, fairly early in the assembly output for fib, I see this: > > 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop > [28]=Oop [40]=Oop [48]=Oop off=644} >;*invokespecial invokeExact >; - > java.lang.invoke.MethodHandle::invokeExact@63 >; - > java.lang.invoke.MethodHandle::invokeExact@23 >; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) >; {optimized virtual_call} > > For fib, the only invokedynamic is the recursive call to fib, so that > would indicate that fib_ruby is not inlining into itself at all here. > And I can't see it inlining into itself anywhere in the assembly > output. > > Later in the same output: > > 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} >;*checkcast >
Re: Good news, bad news
FWIW, perf with indy versus monomorphic inline caching on that bench_method_dispatch_only benchmark: ~/projects/jruby ➔ jruby --server -X+C bench/language/bench_method_dispatch_only.rbTest ruby method: 1000k loops calling self's foo 10 times 1.129000 0.00 1.129000 ( 0.662000) 0.409000 0.00 0.409000 ( 0.409000) 0.455000 0.00 0.455000 ( 0.455000) 0.428000 0.00 0.428000 ( 0.428000) 0.474000 0.00 0.474000 ( 0.474000) 0.47 0.00 0.47 ( 0.47) 0.458000 0.00 0.458000 ( 0.458000) 0.495000 0.00 0.495000 ( 0.495000) 0.46 0.00 0.46 ( 0.46) 0.508000 0.00 0.508000 ( 0.508000) ~/projects/jruby ➔ jruby --server -Xcompile.invokedynamic=false -X+C bench/language/bench_method_dispatch_only.rb Test ruby method: 1000k loops calling self's foo 10 times 0.377000 0.00 0.377000 ( 0.315000) 0.211000 0.00 0.211000 ( 0.207000) 0.132000 0.00 0.132000 ( 0.132000) 0.128000 0.00 0.128000 ( 0.128000) 0.135000 0.00 0.135000 ( 0.135000) 0.14 0.00 0.14 ( 0.14) 0.122000 0.00 0.122000 ( 0.122000) 0.122000 0.00 0.122000 ( 0.122000) 0.122000 0.00 0.122000 ( 0.122000) 0.122000 0.00 0.122000 ( 0.122000) Previously, invokedynamic version clocked in *much* faster than the MIC version...like an order of magnitude faster. - Charlie On Mon, May 23, 2011 at 4:56 PM, Charles Oliver Nutter wrote: > Another example, running bench/language/bench_method_dispatch_only, > which runs a 1m iteration loop that invokes an empty "foo" method five > times: > > https://gist.github.com/9008f94fc677f3fe98e7 > > Note again that it seems like only the test logic and maybe some of > the logic wrapping the foo call inline...the foo calls themselves do > not appear in logc inlining graph at all. > > - Charlie > > On Mon, May 23, 2011 at 4:50 PM, Charles Oliver Nutter > wrote: >> Also, fwiw...after these two chunks in LogCompilation output, I see >> nothing else inlined into fib_ruby, including a monomorphic call path >> through PlusCallSite ending at RubyFixnum#op_plus (the integer + >> operation). That would also affect performance. >> >> I also do not see any indication *why* nothing inlines past this >> point. Usually it would say "too big" or something. >> >> I do see MinusCallSite inline earlier. >> >> - Charlie >> >> On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter >> wrote: >>> The following chunk should be the invokedynamic call to fib, via a >>> GWT, an arg permuter, and perhaps one convert: >>> >>> @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) >>> @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) >>> @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test >>> (20 bytes) >>> @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) >>> @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) >>> @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) >>> @ 1 java.lang.Boolean::valueOf (14 bytes) >>> @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) >>> @ 11 java.lang.Boolean::booleanValue (5 bytes) >>> @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 >>> bytes) >>> @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >>> @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) >>> >>> This seems to only be the test logic; the actual fib invocation >>> doesn't appear to show up in the inlining graph at all. Am I right? >>> >>> I see two of these in the LogCompilation output and nothing else >>> around them. I'd expect to see them do the invocation of fib_ruby >>> somewhere in there. It's like the "success" branch of GWT is not even >>> being considered for inlining. >>> >>> - Charlie >>> >>> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez >>> wrote: If there were to be a recursive inline in there, where would it occur? I can't tell from the names where in that inline tree where the recursive call occurs. tom On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: > fib_ruby LogCompilation inlining graph, showing that fib_ruby is not > inlined: https://gist.github.com/f2b665ad3c97ba622ebf > > Can anyone suggest other flags I can try to adjust to get things to > inline better? > > FWIW, the handle chain in question that's not inlining is pretty simple: > > * DMH pointing back at fib_ruby > * permute args > * GWT > > - Charlie > > On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter > wrote: >> I'm wo
Re: Good news, bad news
Another example, running bench/language/bench_method_dispatch_only, which runs a 1m iteration loop that invokes an empty "foo" method five times: https://gist.github.com/9008f94fc677f3fe98e7 Note again that it seems like only the test logic and maybe some of the logic wrapping the foo call inline...the foo calls themselves do not appear in logc inlining graph at all. - Charlie On Mon, May 23, 2011 at 4:50 PM, Charles Oliver Nutter wrote: > Also, fwiw...after these two chunks in LogCompilation output, I see > nothing else inlined into fib_ruby, including a monomorphic call path > through PlusCallSite ending at RubyFixnum#op_plus (the integer + > operation). That would also affect performance. > > I also do not see any indication *why* nothing inlines past this > point. Usually it would say "too big" or something. > > I do see MinusCallSite inline earlier. > > - Charlie > > On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter > wrote: >> The following chunk should be the invokedynamic call to fib, via a >> GWT, an arg permuter, and perhaps one convert: >> >> @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) >> @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) >> @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test >> (20 bytes) >> @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) >> @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) >> @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) >> @ 1 java.lang.Boolean::valueOf (14 bytes) >> @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) >> @ 11 java.lang.Boolean::booleanValue (5 bytes) >> @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes) >> @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) >> @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) >> >> This seems to only be the test logic; the actual fib invocation >> doesn't appear to show up in the inlining graph at all. Am I right? >> >> I see two of these in the LogCompilation output and nothing else >> around them. I'd expect to see them do the invocation of fib_ruby >> somewhere in there. It's like the "success" branch of GWT is not even >> being considered for inlining. >> >> - Charlie >> >> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez >> wrote: >>> If there were to be a recursive inline in there, where would it occur? I >>> can't tell from the names where in that inline tree where the recursive >>> call occurs. >>> >>> tom >>> >>> On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: >>> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not inlined: https://gist.github.com/f2b665ad3c97ba622ebf Can anyone suggest other flags I can try to adjust to get things to inline better? FWIW, the handle chain in question that's not inlining is pretty simple: * DMH pointing back at fib_ruby * permute args * GWT - Charlie On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter wrote: > I'm working up a set of files that show JRuby compilation output, but > I noticed a couple things that might be interesting right now. > > First off, fairly early in the assembly output for fib, I see this: > > 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop > [28]=Oop [40]=Oop [48]=Oop off=644} > ;*invokespecial invokeExact > ; - > java.lang.invoke.MethodHandle::invokeExact@63 > ; - > java.lang.invoke.MethodHandle::invokeExact@23 > ; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) > ; {optimized virtual_call} > > For fib, the only invokedynamic is the recursive call to fib, so that > would indicate that fib_ruby is not inlining into itself at all here. > And I can't see it inlining into itself anywhere in the assembly > output. > > Later in the same output: > > 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} > ;*checkcast > ; - > java.lang.invoke.MethodHandle::invokeExact@40 > ; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) > ; {runtime_call} > 0x02877044: call 0x0105a9d0 ;*checkcast > ; - > java.lang.invoke.M
Re: Good news, bad news
Also, fwiw...after these two chunks in LogCompilation output, I see nothing else inlined into fib_ruby, including a monomorphic call path through PlusCallSite ending at RubyFixnum#op_plus (the integer + operation). That would also affect performance. I also do not see any indication *why* nothing inlines past this point. Usually it would say "too big" or something. I do see MinusCallSite inline earlier. - Charlie On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter wrote: > The following chunk should be the invokedynamic call to fib, via a > GWT, an arg permuter, and perhaps one convert: > > @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) > @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) > @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) > @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) > @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test > (20 bytes) > @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) > @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) > @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) > @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) > @ 1 java.lang.Boolean::valueOf (14 bytes) > @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) > @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) > @ 11 java.lang.Boolean::booleanValue (5 bytes) > @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes) > @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) > @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) > > This seems to only be the test logic; the actual fib invocation > doesn't appear to show up in the inlining graph at all. Am I right? > > I see two of these in the LogCompilation output and nothing else > around them. I'd expect to see them do the invocation of fib_ruby > somewhere in there. It's like the "success" branch of GWT is not even > being considered for inlining. > > - Charlie > > On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez > wrote: >> If there were to be a recursive inline in there, where would it occur? I >> can't tell from the names where in that inline tree where the recursive call >> occurs. >> >> tom >> >> On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: >> >>> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not >>> inlined: https://gist.github.com/f2b665ad3c97ba622ebf >>> >>> Can anyone suggest other flags I can try to adjust to get things to >>> inline better? >>> >>> FWIW, the handle chain in question that's not inlining is pretty simple: >>> >>> * DMH pointing back at fib_ruby >>> * permute args >>> * GWT >>> >>> - Charlie >>> >>> On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter >>> wrote: I'm working up a set of files that show JRuby compilation output, but I noticed a couple things that might be interesting right now. First off, fairly early in the assembly output for fib, I see this: 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop [28]=Oop [40]=Oop [48]=Oop off=644} ;*invokespecial invokeExact ; - java.lang.invoke.MethodHandle::invokeExact@63 ; - java.lang.invoke.MethodHandle::invokeExact@23 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) ; {optimized virtual_call} For fib, the only invokedynamic is the recursive call to fib, so that would indicate that fib_ruby is not inlining into itself at all here. And I can't see it inlining into itself anywhere in the assembly output. Later in the same output: 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} ;*checkcast ; - java.lang.invoke.MethodHandle::invokeExact@40 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) ; {runtime_call} 0x02877044: call 0x0105a9d0 ;*checkcast ; - java.lang.invoke.MethodHandle::invokeExact@40 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) ; {runtime_call} These appear repeatedly near the invokedynamic invocation above. If I'm reading this right, neither the recursive call nor logic involved in that particular handle is inlining. Am I right? Here's the complete assembly dump (i386) for the fib_ruby method: https://gist.github.com/987640 In other news, MaxInlin
Re: Good news, bad news
The following chunk should be the invokedynamic call to fib, via a GWT, an arg permuter, and perhaps one convert: @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes) @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes) @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test (20 bytes) @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) @ 8 org.jruby.RubyModule::getCacheToken (5 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes) @ 1 java.lang.Boolean::valueOf (14 bytes) @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes) @ 11 java.lang.Boolean::booleanValue (5 bytes) @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes) @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes) @ 37 sun.invoke.util.ValueConversions::identity (2 bytes) This seems to only be the test logic; the actual fib invocation doesn't appear to show up in the inlining graph at all. Am I right? I see two of these in the LogCompilation output and nothing else around them. I'd expect to see them do the invocation of fib_ruby somewhere in there. It's like the "success" branch of GWT is not even being considered for inlining. - Charlie On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez wrote: > If there were to be a recursive inline in there, where would it occur? I > can't tell from the names where in that inline tree where the recursive call > occurs. > > tom > > On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: > >> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not >> inlined: https://gist.github.com/f2b665ad3c97ba622ebf >> >> Can anyone suggest other flags I can try to adjust to get things to >> inline better? >> >> FWIW, the handle chain in question that's not inlining is pretty simple: >> >> * DMH pointing back at fib_ruby >> * permute args >> * GWT >> >> - Charlie >> >> On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter >> wrote: >>> I'm working up a set of files that show JRuby compilation output, but >>> I noticed a couple things that might be interesting right now. >>> >>> First off, fairly early in the assembly output for fib, I see this: >>> >>> 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop >>> [28]=Oop [40]=Oop [48]=Oop off=644} >>> ;*invokespecial invokeExact >>> ; - >>> java.lang.invoke.MethodHandle::invokeExact@63 >>> ; - >>> java.lang.invoke.MethodHandle::invokeExact@23 >>> ; - >>> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) >>> ; {optimized virtual_call} >>> >>> For fib, the only invokedynamic is the recursive call to fib, so that >>> would indicate that fib_ruby is not inlining into itself at all here. >>> And I can't see it inlining into itself anywhere in the assembly >>> output. >>> >>> Later in the same output: >>> >>> 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} >>> ;*checkcast >>> ; - >>> java.lang.invoke.MethodHandle::invokeExact@40 >>> ; - >>> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) >>> ; {runtime_call} >>> 0x02877044: call 0x0105a9d0 ;*checkcast >>> ; - >>> java.lang.invoke.MethodHandle::invokeExact@40 >>> ; - >>> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) >>> ; {runtime_call} >>> >>> These appear repeatedly near the invokedynamic invocation above. If >>> I'm reading this right, neither the recursive call nor logic involved >>> in that particular handle is inlining. Am I right? >>> >>> Here's the complete assembly dump (i386) for the fib_ruby method: >>> https://gist.github.com/987640 >>> >>> In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not >>> appear to improve performance. I also tried bumping up >>> MaxRecursiveInlineLevel and MaxInlineLevel with no effect. >>> >>> - Charlie >>> >> ___ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > ___ mlvm-dev mailing list mlvm-dev@openj
Re: Good news, bad news
On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: > * GWT GWT changed, but in a way that should make it easier (not harder) to inline. That could be the problem. The new format (using ricochet frames) is: GWT(p,t,f)(*a) := ( selectAlternative(p(*a), t, f) ).invokeExact(*a) where selectAlternative(z,a,b) := z?a:b Or: GWT(p,t,f) := foldArguments(tailcall, folder) where tailcall := exactInvoker(t.type()) chooser := selectAlternative.insertArguments(1,t,f) folder := filterReturnValue(p, chooser) The new MHW code should allow the foldArguments node to be inlined. But that may be the weakest link. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
If there were to be a recursive inline in there, where would it occur? I can't tell from the names where in that inline tree where the recursive call occurs. tom On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote: > fib_ruby LogCompilation inlining graph, showing that fib_ruby is not > inlined: https://gist.github.com/f2b665ad3c97ba622ebf > > Can anyone suggest other flags I can try to adjust to get things to > inline better? > > FWIW, the handle chain in question that's not inlining is pretty simple: > > * DMH pointing back at fib_ruby > * permute args > * GWT > > - Charlie > > On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter > wrote: >> I'm working up a set of files that show JRuby compilation output, but >> I noticed a couple things that might be interesting right now. >> >> First off, fairly early in the assembly output for fib, I see this: >> >> 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop >> [28]=Oop [40]=Oop [48]=Oop off=644} >>;*invokespecial invokeExact >>; - >> java.lang.invoke.MethodHandle::invokeExact@63 >>; - >> java.lang.invoke.MethodHandle::invokeExact@23 >>; - >> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) >>; {optimized virtual_call} >> >> For fib, the only invokedynamic is the recursive call to fib, so that >> would indicate that fib_ruby is not inlining into itself at all here. >> And I can't see it inlining into itself anywhere in the assembly >> output. >> >> Later in the same output: >> >> 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} >>;*checkcast >>; - >> java.lang.invoke.MethodHandle::invokeExact@40 >>; - >> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) >>; {runtime_call} >> 0x02877044: call 0x0105a9d0 ;*checkcast >>; - >> java.lang.invoke.MethodHandle::invokeExact@40 >>; - >> bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) >>; {runtime_call} >> >> These appear repeatedly near the invokedynamic invocation above. If >> I'm reading this right, neither the recursive call nor logic involved >> in that particular handle is inlining. Am I right? >> >> Here's the complete assembly dump (i386) for the fib_ruby method: >> https://gist.github.com/987640 >> >> In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not >> appear to improve performance. I also tried bumping up >> MaxRecursiveInlineLevel and MaxInlineLevel with no effect. >> >> - Charlie >> > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
fib_ruby LogCompilation inlining graph, showing that fib_ruby is not inlined: https://gist.github.com/f2b665ad3c97ba622ebf Can anyone suggest other flags I can try to adjust to get things to inline better? FWIW, the handle chain in question that's not inlining is pretty simple: * DMH pointing back at fib_ruby * permute args * GWT - Charlie On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter wrote: > I'm working up a set of files that show JRuby compilation output, but > I noticed a couple things that might be interesting right now. > > First off, fairly early in the assembly output for fib, I see this: > > 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop > [28]=Oop [40]=Oop [48]=Oop off=644} > ;*invokespecial invokeExact > ; - > java.lang.invoke.MethodHandle::invokeExact@63 > ; - > java.lang.invoke.MethodHandle::invokeExact@23 > ; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) > ; {optimized virtual_call} > > For fib, the only invokedynamic is the recursive call to fib, so that > would indicate that fib_ruby is not inlining into itself at all here. > And I can't see it inlining into itself anywhere in the assembly > output. > > Later in the same output: > > 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} > ;*checkcast > ; - > java.lang.invoke.MethodHandle::invokeExact@40 > ; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) > ; {runtime_call} > 0x02877044: call 0x0105a9d0 ;*checkcast > ; - > java.lang.invoke.MethodHandle::invokeExact@40 > ; - > bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) > ; {runtime_call} > > These appear repeatedly near the invokedynamic invocation above. If > I'm reading this right, neither the recursive call nor logic involved > in that particular handle is inlining. Am I right? > > Here's the complete assembly dump (i386) for the fib_ruby method: > https://gist.github.com/987640 > > In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not > appear to improve performance. I also tried bumping up > MaxRecursiveInlineLevel and MaxInlineLevel with no effect. > > - Charlie > ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 2:19 PM, Charles Oliver Nutter wrote: > I'm working up a set of files that show JRuby compilation output, but > I noticed a couple things that might be interesting right now. That looks promising. It seems something got bumped in our inlining logic. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
I'm working up a set of files that show JRuby compilation output, but I noticed a couple things that might be interesting right now. First off, fairly early in the assembly output for fib, I see this: 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop [28]=Oop [40]=Oop [48]=Oop off=644} ;*invokespecial invokeExact ; - java.lang.invoke.MethodHandle::invokeExact@63 ; - java.lang.invoke.MethodHandle::invokeExact@23 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7) ; {optimized virtual_call} For fib, the only invokedynamic is the recursive call to fib, so that would indicate that fib_ruby is not inlining into itself at all here. And I can't see it inlining into itself anywhere in the assembly output. Later in the same output: 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444} ;*checkcast ; - java.lang.invoke.MethodHandle::invokeExact@40 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) ; {runtime_call} 0x02877044: call 0x0105a9d0 ;*checkcast ; - java.lang.invoke.MethodHandle::invokeExact@40 ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7) ; {runtime_call} These appear repeatedly near the invokedynamic invocation above. If I'm reading this right, neither the recursive call nor logic involved in that particular handle is inlining. Am I right? Here's the complete assembly dump (i386) for the fib_ruby method: https://gist.github.com/987640 In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not appear to improve performance. I also tried bumping up MaxRecursiveInlineLevel and MaxInlineLevel with no effect. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 8:45 PM, Charles Oliver Nutter wrote: > On Mon, May 23, 2011 at 11:05 AM, Christian Thalinger > wrote: >> I was on vacation for a couple of days but now I'm back. >> >> I looked at the inlining tree for fib_recursive and I can't see the fib_ruby >> method getting compiled. But instead I see something I've never seen before: >> >> 2949 105 org.jruby.ast.CallOneArgNode::interpret (34 bytes) >> >> or >> >> 2997 108 ! org.jruby.evaluator.ASTInterpreter::INTERPRET_METHOD >> (126 bytes) >> >> Does that mean JRuby doesn't compile these methods but interpret them? > > Hmm, I wonder if there's a problem somewhere preventing JIT now. Run > JRuby with -d -X+C to debug logging and force compilation. I will look > into it today myself. I didn't use -X+C but it should compile without it too, right? And, btw, I used JDK 7 b143. I try again tomorrow. -- Christian > > I did see fib_ruby compile and was able to get assembly dumps out, but > it performed terribly. I'll try to make extra sure everything's > jitting. > > Ahh the joys of a mixed-mode runtime, eh? > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Mon, May 23, 2011 at 2:02 PM, John Rose wrote: > Sometimes you can get your dumps faster by using -XX:-DisplayVMOutput along > with -XX:+PrintAssembly. The output goes to hotspot.log anyway. > > Also, -XX:CompileCommand=print,foo::bar will focus assembly printing on the > method(s) you care about. Ok. I'll try to get full compilation logs and assembly for "fib" today. I have confirmed it is indeed compiling and jitting, but still 2x slower than no indy at all. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 23, 2011, at 11:45 AM, Charles Oliver Nutter wrote: > I did see fib_ruby compile and was able to get assembly dumps out, but > it performed terribly. I'll try to make extra sure everything's > jitting. > > Ahh the joys of a mixed-mode runtime, eh? Indeed. Anyway, thanks for pushing on this. Sometimes you can get your dumps faster by using -XX:-DisplayVMOutput along with -XX:+PrintAssembly. The output goes to hotspot.log anyway. Also, -XX:CompileCommand=print,foo::bar will focus assembly printing on the method(s) you care about. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Mon, May 23, 2011 at 11:05 AM, Christian Thalinger wrote: > I was on vacation for a couple of days but now I'm back. > > I looked at the inlining tree for fib_recursive and I can't see the fib_ruby > method getting compiled. But instead I see something I've never seen before: > > 2949 105 org.jruby.ast.CallOneArgNode::interpret (34 bytes) > > or > > 2997 108 ! org.jruby.evaluator.ASTInterpreter::INTERPRET_METHOD > (126 bytes) > > Does that mean JRuby doesn't compile these methods but interpret them? Hmm, I wonder if there's a problem somewhere preventing JIT now. Run JRuby with -d -X+C to debug logging and force compilation. I will look into it today myself. I did see fib_ruby compile and was able to get assembly dumps out, but it performed terribly. I'll try to make extra sure everything's jitting. Ahh the joys of a mixed-mode runtime, eh? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 21, 2011, at 1:31 AM, Charles Oliver Nutter wrote: > I am still using the specialized math callsites because they inline well and > an Indy-based approach was still quite a bit slower last time I tried. > > I am back to being worried about the inlining thresholds. With no MHs and > runtime profiled direct calls, I was able to get this all to inline. So > basically the fact that I am using Indy and MHs now penalizes me over not > just profiled direct calls, but over a totally generic inline cache that > usually doesn't inline at all. That's pretty bad :) I was on vacation for a couple of days but now I'm back. I looked at the inlining tree for fib_recursive and I can't see the fib_ruby method getting compiled. But instead I see something I've never seen before: 2949 105 org.jruby.ast.CallOneArgNode::interpret (34 bytes) or 2997 108 ! org.jruby.evaluator.ASTInterpreter::INTERPRET_METHOD (126 bytes) Does that mean JRuby doesn't compile these methods but interpret them? -- Christian > > I can try fiddling with the thresholds later, but do we agree that an > invokedynamic bound to a GWT plus some intermediate adapter handles ending in > a direct handle should inline as well as e.g. an invokevirtual? > > - Charlie (mobile) > > On May 20, 2011, at 17:35, Rémi Forax wrote: > >> On 05/20/2011 11:08 PM, Tom Rodriguez wrote: >>> Is it the inlining policy? I played a little with this and the current >>> default policy is still not as good as the big tweaks to MaxInlineSize and >>> InlineSmallCode. >>> >>> tom >> >> I tend to agree, >> things like isFixnumReopened and getCacheToken seem to be not inlined. >> >> Charles, do you have tested if the inline flags are tweaked ? >> >> Also I do not understand why you have JRuby's callsites like MinusCallSite >> in the stacktrace. You should not use it if invokedynamic is enabled. >> >> Rémi >> >>> On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote: >>> Meh, false alarm. This is just the default empty Object[] assigned to all objects. I still can't explain the perf degradation. Here's the full assembly for fib_ruby: https://gist.github.com/983721 It seems like stuff is inlining, so I have no idea why performance is so terrible. - Charlie On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter wrote: > Another update...I managed to turn off the ricochet frames and have it > still work (???). The with-ricochet mode is faster, but still 2-3x > slower than no indy at all (using JRuby's IC), and 4-5x slower than > invokedynamic was recently... > > HOWEVER...an assembly dump I just did contains some very odd code that > might indicate a problem in my code. Hold the presses on digging into > it on your end until I can explain this. (this = apparently standing > up an instance variable table for Fixnum objects...*definitely* should > not be happening). > > 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial > ; - > org.jruby.RubyBasicObject::@1 (line 218) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > 0x02868f2d: mov ebx, [esp+0x14] > 0x02868f31: mov ecx, [ebx+0xBC] > 0x02868f37: mov edx, eax > 0x02868f39: shr edx, 9 > 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable > ; - > org.jruby.RubyBasicObject::@8 (line 84) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > 0x02868f43: mov edx, a 'java/lang/Class' = > 'org/jruby/RubyBasicObject' > ; {oop(a 'java/lang/Class'
Re: Good news, bad news
Yes we do agree. And thank you for framing up the question so clearly. -- John (on my iPhone) On May 20, 2011, at 4:31 PM, Charles Oliver Nutter wrote: > do we agree that an invokedynamic bound to a GWT plus some intermediate > adapter handles ending in a direct handle should inline as well as e.g. an > invokevirtual? ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
I am still using the specialized math callsites because they inline well and an Indy-based approach was still quite a bit slower last time I tried. I am back to being worried about the inlining thresholds. With no MHs and runtime profiled direct calls, I was able to get this all to inline. So basically the fact that I am using Indy and MHs now penalizes me over not just profiled direct calls, but over a totally generic inline cache that usually doesn't inline at all. That's pretty bad :) I can try fiddling with the thresholds later, but do we agree that an invokedynamic bound to a GWT plus some intermediate adapter handles ending in a direct handle should inline as well as e.g. an invokevirtual? - Charlie (mobile) On May 20, 2011, at 17:35, Rémi Forax wrote: > On 05/20/2011 11:08 PM, Tom Rodriguez wrote: >> Is it the inlining policy? I played a little with this and the current >> default policy is still not as good as the big tweaks to MaxInlineSize and >> InlineSmallCode. >> >> tom > > I tend to agree, > things like isFixnumReopened and getCacheToken seem to be not inlined. > > Charles, do you have tested if the inline flags are tweaked ? > > Also I do not understand why you have JRuby's callsites like MinusCallSite > in the stacktrace. You should not use it if invokedynamic is enabled. > > Rémi > >> On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote: >> >>> Meh, false alarm. This is just the default empty Object[] assigned to >>> all objects. I still can't explain the perf degradation. >>> >>> Here's the full assembly for fib_ruby: https://gist.github.com/983721 >>> >>> It seems like stuff is inlining, so I have no idea why performance is >>> so terrible. >>> >>> - Charlie >>> >>> On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter >>> wrote: Another update...I managed to turn off the ricochet frames and have it still work (???). The with-ricochet mode is faster, but still 2-3x slower than no indy at all (using JRuby's IC), and 4-5x slower than invokedynamic was recently... HOWEVER...an assembly dump I just did contains some very odd code that might indicate a problem in my code. Hold the presses on digging into it on your end until I can explain this. (this = apparently standing up an instance variable table for Fixnum objects...*definitely* should not be happening). 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial ; - org.jruby.RubyBasicObject::@1 (line 218) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) 0x02868f2d: mov ebx, [esp+0x14] 0x02868f31: mov ecx, [ebx+0xBC] 0x02868f37: mov edx, eax 0x02868f39: shr edx, 9 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable ; - org.jruby.RubyBasicObject::@8 (line 84) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) 0x02868f43: mov edx, a 'java/lang/Class' = 'org/jruby/RubyBasicObject' ; {oop(a 'java/lang/Class' = 'org/jruby/RubyBasicObject')} 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY ; - org.jruby.RubyBasicObject::@5 (line 84) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.
Re: Good news, bad news
On 05/20/2011 11:08 PM, Tom Rodriguez wrote: > Is it the inlining policy? I played a little with this and the current > default policy is still not as good as the big tweaks to MaxInlineSize and > InlineSmallCode. > > tom I tend to agree, things like isFixnumReopened and getCacheToken seem to be not inlined. Charles, do you have tested if the inline flags are tweaked ? Also I do not understand why you have JRuby's callsites like MinusCallSite in the stacktrace. You should not use it if invokedynamic is enabled. Rémi > On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote: > >> Meh, false alarm. This is just the default empty Object[] assigned to >> all objects. I still can't explain the perf degradation. >> >> Here's the full assembly for fib_ruby: https://gist.github.com/983721 >> >> It seems like stuff is inlining, so I have no idea why performance is >> so terrible. >> >> - Charlie >> >> On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter >> wrote: >>> Another update...I managed to turn off the ricochet frames and have it >>> still work (???). The with-ricochet mode is faster, but still 2-3x >>> slower than no indy at all (using JRuby's IC), and 4-5x slower than >>> invokedynamic was recently... >>> >>> HOWEVER...an assembly dump I just did contains some very odd code that >>> might indicate a problem in my code. Hold the presses on digging into >>> it on your end until I can explain this. (this = apparently standing >>> up an instance variable table for Fixnum objects...*definitely* should >>> not be happening). >>> >>> 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial >>> ; - >>> org.jruby.RubyBasicObject::@1 (line 218) >>> ; - >>> org.jruby.RubyObject::@2 (line 116) >>> ; - >>> org.jruby.RubyNumeric::@2 (line 114) >>> ; - >>> org.jruby.RubyInteger::@2 (line 95) >>> ; - >>> org.jruby.RubyFixnum::@5 (line 112) >>> ; - >>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>> ; - >>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>> ; - >>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>> 0x02868f2d: mov ebx, [esp+0x14] >>> 0x02868f31: mov ecx, [ebx+0xBC] >>> 0x02868f37: mov edx, eax >>> 0x02868f39: shr edx, 9 >>> 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable >>> ; - >>> org.jruby.RubyBasicObject::@8 (line 84) >>> ; - >>> org.jruby.RubyObject::@2 (line 116) >>> ; - >>> org.jruby.RubyNumeric::@2 (line 114) >>> ; - >>> org.jruby.RubyInteger::@2 (line 95) >>> ; - >>> org.jruby.RubyFixnum::@5 (line 112) >>> ; - >>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>> ; - >>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>> ; - >>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>> 0x02868f43: mov edx, a 'java/lang/Class' = >>> 'org/jruby/RubyBasicObject' >>> ; {oop(a 'java/lang/Class' = >>> 'org/jruby/RubyBasicObject')} >>> 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY >>> ; - >>> org.jruby.RubyBasicObject::@5 (line 84) >>> ; - >>> org.jruby.RubyObject::@2 (line 116) >>> ; - >>> org.jruby.RubyNumeric::@2 (line 114) >>> ; - >>> org.jruby.RubyInteger::@2 (line 95) >>> ; - >>> org.jruby.RubyFixnum::@5 (line 112) >>> ; - >>> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>> ; - >>> org.jruby.RubyFixnum::op_minus@34 (line 395) >>> ; - >>> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >>> 0x02868f4b: mov [eax+0x10], ebx >>> 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable >>> ; - >>> org.jruby.RubyBasicObject::@8 (line 84) >>> ; - >>> org.jruby.RubyObject::@2 (line 116) >>> ; - >>> org.jruby.RubyNumeric::@2 (line 114) >>> ; - >>> org.jruby.RubyInteger::@2 (line 95) >>> ; - >>> org.jruby.RubyFixnum::@5 (line 112) >>> ; - >>> org.jruby.RubyFixnum
Re: Good news, bad news
Is it the inlining policy? I played a little with this and the current default policy is still not as good as the big tweaks to MaxInlineSize and InlineSmallCode. tom On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote: > Meh, false alarm. This is just the default empty Object[] assigned to > all objects. I still can't explain the perf degradation. > > Here's the full assembly for fib_ruby: https://gist.github.com/983721 > > It seems like stuff is inlining, so I have no idea why performance is > so terrible. > > - Charlie > > On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter > wrote: >> Another update...I managed to turn off the ricochet frames and have it >> still work (???). The with-ricochet mode is faster, but still 2-3x >> slower than no indy at all (using JRuby's IC), and 4-5x slower than >> invokedynamic was recently... >> >> HOWEVER...an assembly dump I just did contains some very odd code that >> might indicate a problem in my code. Hold the presses on digging into >> it on your end until I can explain this. (this = apparently standing >> up an instance variable table for Fixnum objects...*definitely* should >> not be happening). >> >> 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial >>; - >> org.jruby.RubyBasicObject::@1 (line 218) >>; - >> org.jruby.RubyObject::@2 (line 116) >>; - >> org.jruby.RubyNumeric::@2 (line 114) >>; - >> org.jruby.RubyInteger::@2 (line 95) >>; - >> org.jruby.RubyFixnum::@5 (line 112) >>; - >> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>; - >> org.jruby.RubyFixnum::op_minus@34 (line 395) >>; - >> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >> 0x02868f2d: mov ebx, [esp+0x14] >> 0x02868f31: mov ecx, [ebx+0xBC] >> 0x02868f37: mov edx, eax >> 0x02868f39: shr edx, 9 >> 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable >>; - >> org.jruby.RubyBasicObject::@8 (line 84) >>; - >> org.jruby.RubyObject::@2 (line 116) >>; - >> org.jruby.RubyNumeric::@2 (line 114) >>; - >> org.jruby.RubyInteger::@2 (line 95) >>; - >> org.jruby.RubyFixnum::@5 (line 112) >>; - >> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>; - >> org.jruby.RubyFixnum::op_minus@34 (line 395) >>; - >> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >> 0x02868f43: mov edx, a 'java/lang/Class' = 'org/jruby/RubyBasicObject' >>; {oop(a 'java/lang/Class' = >> 'org/jruby/RubyBasicObject')} >> 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY >>; - >> org.jruby.RubyBasicObject::@5 (line 84) >>; - >> org.jruby.RubyObject::@2 (line 116) >>; - >> org.jruby.RubyNumeric::@2 (line 114) >>; - >> org.jruby.RubyInteger::@2 (line 95) >>; - >> org.jruby.RubyFixnum::@5 (line 112) >>; - >> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>; - >> org.jruby.RubyFixnum::op_minus@34 (line 395) >>; - >> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >> 0x02868f4b: mov [eax+0x10], ebx >> 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable >>; - >> org.jruby.RubyBasicObject::@8 (line 84) >>; - >> org.jruby.RubyObject::@2 (line 116) >>; - >> org.jruby.RubyNumeric::@2 (line 114) >>; - >> org.jruby.RubyInteger::@2 (line 95) >>; - >> org.jruby.RubyFixnum::@5 (line 112) >>; - >> org.jruby.RubyFixnum::newFixnum@25 (line 173) >>; - >> org.jruby.RubyFixnum::op_minus@34 (line 395) >>; - >> org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) >> >> On Fri, May 20, 2011 at 8:10 AM, Charles Oliver Nutter >> wrote: >>> On Fri, May 20, 2011 at 1:07 AM, Ola Bini wrote: > Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. > > If you can find a microbenchmark which shows diffe
Re: Good news, bad news
Meh, false alarm. This is just the default empty Object[] assigned to all objects. I still can't explain the perf degradation. Here's the full assembly for fib_ruby: https://gist.github.com/983721 It seems like stuff is inlining, so I have no idea why performance is so terrible. - Charlie On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter wrote: > Another update...I managed to turn off the ricochet frames and have it > still work (???). The with-ricochet mode is faster, but still 2-3x > slower than no indy at all (using JRuby's IC), and 4-5x slower than > invokedynamic was recently... > > HOWEVER...an assembly dump I just did contains some very odd code that > might indicate a problem in my code. Hold the presses on digging into > it on your end until I can explain this. (this = apparently standing > up an instance variable table for Fixnum objects...*definitely* should > not be happening). > > 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial > ; - > org.jruby.RubyBasicObject::@1 (line 218) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > 0x02868f2d: mov ebx, [esp+0x14] > 0x02868f31: mov ecx, [ebx+0xBC] > 0x02868f37: mov edx, eax > 0x02868f39: shr edx, 9 > 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable > ; - > org.jruby.RubyBasicObject::@8 (line 84) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > 0x02868f43: mov edx, a 'java/lang/Class' = 'org/jruby/RubyBasicObject' > ; {oop(a 'java/lang/Class' = > 'org/jruby/RubyBasicObject')} > 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY > ; - > org.jruby.RubyBasicObject::@5 (line 84) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > 0x02868f4b: mov [eax+0x10], ebx > 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable > ; - > org.jruby.RubyBasicObject::@8 (line 84) > ; - > org.jruby.RubyObject::@2 (line 116) > ; - > org.jruby.RubyNumeric::@2 (line 114) > ; - > org.jruby.RubyInteger::@2 (line 95) > ; - > org.jruby.RubyFixnum::@5 (line 112) > ; - > org.jruby.RubyFixnum::newFixnum@25 (line 173) > ; - > org.jruby.RubyFixnum::op_minus@34 (line 395) > ; - > org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) > > On Fri, May 20, 2011 at 8:10 AM, Charles Oliver Nutter > wrote: >> On Fri, May 20, 2011 at 1:07 AM, Ola Bini wrote: Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. If you can find a microbenchmark which shows differential slowdown, we can debug it. >>> >>> When trying the above all my tests fail with this: >>> java.lang.RuntimeException: java.lang.InternalError: NYI >> >> Same here. >> >> Reproducing it is pretty easy... >> >> * Clone JRuby: git clone https://github.com/jruby/jruby.git >> * Build JRuby: cd jruby ; ant >> * Run benchmarks: bin/jruby bench
Re: Good news, bad news
Another update...I managed to turn off the ricochet frames and have it still work (???). The with-ricochet mode is faster, but still 2-3x slower than no indy at all (using JRuby's IC), and 4-5x slower than invokedynamic was recently... HOWEVER...an assembly dump I just did contains some very odd code that might indicate a problem in my code. Hold the presses on digging into it on your end until I can explain this. (this = apparently standing up an instance variable table for Fixnum objects...*definitely* should not be happening). 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial ; - org.jruby.RubyBasicObject::@1 (line 218) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) 0x02868f2d: mov ebx, [esp+0x14] 0x02868f31: mov ecx, [ebx+0xBC] 0x02868f37: mov edx, eax 0x02868f39: shr edx, 9 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable ; - org.jruby.RubyBasicObject::@8 (line 84) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) 0x02868f43: mov edx, a 'java/lang/Class' = 'org/jruby/RubyBasicObject' ; {oop(a 'java/lang/Class' = 'org/jruby/RubyBasicObject')} 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY ; - org.jruby.RubyBasicObject::@5 (line 84) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) 0x02868f4b: mov [eax+0x10], ebx 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable ; - org.jruby.RubyBasicObject::@8 (line 84) ; - org.jruby.RubyObject::@2 (line 116) ; - org.jruby.RubyNumeric::@2 (line 114) ; - org.jruby.RubyInteger::@2 (line 95) ; - org.jruby.RubyFixnum::@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) ; - org.jruby.RubyFixnum::op_minus@34 (line 395) ; - org.jruby.runtime.callsite.MinusCallSite::call@24 (line 16) On Fri, May 20, 2011 at 8:10 AM, Charles Oliver Nutter wrote: > On Fri, May 20, 2011 at 1:07 AM, Ola Bini wrote: >>> Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. >>> >>> If you can find a microbenchmark which shows differential slowdown, we can >>> debug it. >> >> When trying the above all my tests fail with this: >> java.lang.RuntimeException: java.lang.InternalError: NYI > > Same here. > > Reproducing it is pretty easy... > > * Clone JRuby: git clone https://github.com/jruby/jruby.git > * Build JRuby: cd jruby ; ant > * Run benchmarks: bin/jruby bench/bench_fib_recursive.rb 10 35 > > There are many benchmarks under the bench dir, almost all of which > show this degradation. You can disable invokedynamic use in JRuby by > passing -Xcompile.invokedynamic=false, as seen here: > > > ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false > bench/bench_fib_recursive.rb 5 35 > WARNING: Both jruby-complete.jar and jruby.jar are present in the > 'lib' directory. Will use jruby.jar > 9227465 > 1.698000 0.00 1.698000 ( 1.645000) > 9227465 > 1.352000
Re: Good news, bad news
On Fri, May 20, 2011 at 1:07 AM, Ola Bini wrote: >> Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. >> >> If you can find a microbenchmark which shows differential slowdown, we can >> debug it. > > When trying the above all my tests fail with this: > java.lang.RuntimeException: java.lang.InternalError: NYI Same here. Reproducing it is pretty easy... * Clone JRuby: git clone https://github.com/jruby/jruby.git * Build JRuby: cd jruby ; ant * Run benchmarks: bin/jruby bench/bench_fib_recursive.rb 10 35 There are many benchmarks under the bench dir, almost all of which show this degradation. You can disable invokedynamic use in JRuby by passing -Xcompile.invokedynamic=false, as seen here: ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false bench/bench_fib_recursive.rb 5 35 WARNING: Both jruby-complete.jar and jruby.jar are present in the 'lib' directory. Will use jruby.jar 9227465 1.698000 0.00 1.698000 ( 1.645000) 9227465 1.352000 0.00 1.352000 ( 1.352000) 9227465 1.364000 0.00 1.364000 ( 1.364000) 9227465 1.344000 0.00 1.344000 ( 1.344000) 9227465 1.336000 0.00 1.336000 ( 1.336000) ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=true bench/bench_fib_recursive.rb 5 35 WARNING: Both jruby-complete.jar and jruby.jar are present in the 'lib' directory. Will use jruby.jar 9227465 3.182000 0.00 3.182000 ( 3.12) 9227465 3.019000 0.00 3.019000 ( 3.019000) 9227465 3.195000 0.00 3.195000 ( 3.195000) 9227465 3.163000 0.00 3.163000 ( 3.163000) 9227465 3.656000 0.00 3.656000 ( 3.656000) I'm sure something's just broken...this benchmark was easily almost 2x faster than the non-invokedynamic version just a few days ago, and it's now 2-3x slower than non-indy logic. I'm going to be online all day, standing by to help in any way I can. I did do a quick LogCompilation last night and the main thing I noticed was that while it seemed like handles are inlining, the eventual DMH does not inline. That would explain some of the perf degradation, I think...but hopefully not all of it. Whatever you need me to run or whatever information...let me know. ALSO...I got the same error as Ola when running JRuby with a standalone jar file: ~/projects/jruby ➔ java -jar lib/jruby-complete.jar bench/bench_fib_recursive.rb MethodHandle.java:-1:in `invokeExact': java.lang.NoClassDefFoundError: org/jruby/runtime/ThreadContext from MethodHandle.java:-1:in `invokeExact' from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby' As in Ola's case, ThreadContext is a core JRuby class. I do *not* get this error when JRuby is loaded from the bootstrap classloader, so I imagine there's some mix-up with generated bytecode and the classloaders that that bytecode gets loaded into. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 2011-05-20 11.21, John Rose wrote: > On May 19, 2011, at 10:06 PM, Charles Oliver Nutter wrote: > >> I've confimed that the i386 build performs just as poorly. Have not >> investigate at inlining or assembly level yet...and it's late, so I'm >> going to get some sleep. > > It's likely that method handle inlining is failing. > > Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. > > If you can find a microbenchmark which shows differential slowdown, we can > debug it. When trying the above all my tests fail with this: java.lang.RuntimeException: java.lang.InternalError: NYI at seph.lang.interpreter.MessageInterpreter.evaluateFully(MessageInterpreter.java:107) at seph.lang.Runtime.evaluateStream(Runtime.java:115) at seph.lang.Runtime.evaluateString(Runtime.java:127) at seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:215) Caused by: java.lang.InternalError: NYI at java.lang.invoke.AdapterMethodHandle.makeReturnConversion(AdapterMethodHandle.java:250) at java.lang.invoke.AdapterMethodHandle.makePairwiseConvert(AdapterMethodHandle.java:206) at java.lang.invoke.MethodHandleImpl.convertArguments(MethodHandleImpl.java:749) at java.lang.invoke.MethodHandleImpl$GuardWithTest.make(MethodHandleImpl.java:926) at java.lang.invoke.MethodHandleImpl.makeGuardWithTest(MethodHandleImpl.java:1066) at java.lang.invoke.MethodHandles.guardWithTest(MethodHandles.java:2072) at java.lang.invoke.SwitchPoint.guardWithTest(SwitchPoint.java:145) at seph.lang.compiler.Bootstrap.initialSetup_intrinsic_if(Bootstrap.java:1298) at seph$gen$abstraction$11.activateWith(Unknown Source) at seph$gen$abstraction$11.activateWith(Unknown Source) at seph.lang.ast.NamedMessage.go(NamedMessage.java:162) at seph.lang.ast.NamedMessage.sendTo(NamedMessage.java:143) at seph.lang.interpreter.MessageInterpreter.evaluateFully(MessageInterpreter.java:92) =( Cheers -- Ola Bini (http://olabini.com) Ioke - JRuby - ThoughtWorks "Yields falsehood when quined" yields falsehood when quined. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On May 19, 2011, at 10:06 PM, Charles Oliver Nutter wrote: > I've confimed that the i386 build performs just as poorly. Have not > investigate at inlining or assembly level yet...and it's late, so I'm > going to get some sleep. It's likely that method handle inlining is failing. Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing. If you can find a microbenchmark which shows differential slowdown, we can debug it. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On Thu, May 19, 2011 at 11:25 PM, Charles Oliver Nutter wrote: > Ok, good news first. The recent drop of fixes from Hotspot seem to > have fixed the problems in methodHandleWalk.cpp. Hooray! > > The bad news is that performance seems to have *severely* degrade > since previous builds. I've confimed that the i386 build performs just as poorly. Have not investigate at inlining or assembly level yet...and it's late, so I'm going to get some sleep. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Good news, bad news
On 2011-05-20 09.55, Charles Oliver Nutter wrote: > Ok, good news first. The recent drop of fixes from Hotspot seem to > have fixed the problems in methodHandleWalk.cpp. Hooray! Sorry, but we still have bad news. Specifically, two failures. First a new ShouldNotReachHere crash: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (frame.cpp:1158), pid=8050, tid=2953850880 # Error: ShouldNotReachHere() # # JRE version: 7.0 # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /Users/olabini/workspace/seph/hs_err_pid8050.log # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # [1]8050 abort seph bench/bench_fib.sp I've attached the crash log And then a bizarre error: [junit] Testcase: recursive_odd_and_even_that_should_blow_the_stack(seph.lang.code.BasicSanityTest): Caused an ERROR [junit] java.lang.NoClassDefFoundError: seph/lang/SephObject [junit] java.lang.RuntimeException: java.lang.NoClassDefFoundError: seph/lang/SephObject [junit] at seph.lang.interpreter.MessageInterpreter.evaluateFully(MessageInterpreter.java:107) [junit] at seph.lang.Runtime.evaluateStream(Runtime.java:115) [junit] at seph.lang.Runtime.evaluateString(Runtime.java:127) [junit] at seph.lang.code.BasicSanityTest.recursive_odd_and_even_that_should_blow_the_stack(BasicSanityTest.java:215) [junit] Caused by: java.lang.NoClassDefFoundError: seph/lang/SephObject [junit] at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) [junit] at java.lang.invoke.MethodHandle.invokeExact(MethodHandle.java) [junit] at seph$gen$abstraction$11.argument_0_0(Unknown Source) [junit] at seph.lang.compiler.Bootstrap.intrinsic_if(Bootstrap.java:1288) [junit] at seph$gen$abstraction$11.activateWith(Unknown Source) [junit] at seph.lang.interpreter.MessageInterpreter.evaluateFully(MessageInterpreter.java:97) seph/lang/SephObject is the basic interface for all Seph operations, and is obviously there - especially since all other test cases run fine... -- Ola Bini (http://olabini.com) Ioke - JRuby - ThoughtWorks "Yields falsehood when quined" yields falsehood when quined. # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (frame.cpp:1158), pid=8050, tid=2953850880 # Error: ShouldNotReachHere() # # JRE version: 7.0 # Java VM: OpenJDK Client VM (21.0-b09 mixed mode bsd-x86 ) # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # --- T H R E A D --- Current thread (0x02018400): VMThread [stack: 0xb0083000,0xb0103000] [id=-1341116416] Stack: [0xb0083000,0xb0103000], sp=0xb0102430, free space=509k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.dylib+0x7b441] VMError::report_and_die()+0x191 V [libjvm.dylib+0x2d454] report_should_not_reach_here(char const*, int)+0x44 V [libjvm.dylib+0x21dbb3] frame::oops_do_internal(OopClosure*, CodeBlobClosure*, RegisterMap*, bool)+0xa3 V [libjvm.dylib+0x49422d] JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x18d V [libjvm.dylib+0x4946ac] Threads::oops_do(OopClosure*, CodeBlobClosure*)+0x3c V [libjvm.dylib+0x43a54c] SharedHeap::process_strong_roots(bool, bool, SharedHeap::ScanningOption, OopClosure*, CodeBlobClosure*, OopsInGenClosure*)+0x1ec V [libjvm.dylib+0x2480b2] GenCollectedHeap::gen_process_strong_roots(int, bool, bool, bool, SharedHeap::ScanningOption, OopsInGenClosure*, bool, OopsInGenClosure*)+0x92 V [libjvm.dylib+0x205f6a] DefNewGeneration::collect(bool, bool, unsigned long, bool)+0x2da V [libjvm.dylib+0x249f2a] GenCollectedHeap::do_collection(bool, bool, unsigned long, bool, int)+0x76a V [libjvm.dylib+0x1b8073] GenCollectorPolicy::satisfy_failed_allocation(unsigned long, bool)+0xa3 V [libjvm.dylib+0x247008] GenCollectedHeap::satisfy_failed_allocation(unsigned long, bool)+0x38 V [libjvm.dylib+0x4d160b] VM_GenCollectForAllocation::doit()+0xeb V [libjvm.dylib+0x80db5] VM_Operation::evaluate()+0x45 V [libjvm.dylib+0x4d34dd] VMThread::evaluate_operation(VM_Operation*)+0x7d V [libjvm.dylib+0x4d3ed2] VMThread::loop()+0x3f2 V [libjvm.dylib+0x4d408c] VMThread::run()+0x8c V [libjvm.dylib+0x3f3cf5] java_start(Thread*)+0xc5 C [libSystem.B.dylib+0x2e7fd] _pthread_start+0x159 C [libSystem.B.dylib+0x2e682] thread_start+0x22 JavaThread 0x00800800 (nid = 2953318400) was being processed Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) v ~RuntimeStub::fa