Re: Latest experiments...happiness and sadness
Hi Charlie,

Can you send us a decent link or two once it actually does drop? I'm not much of a Ruby head generally, but I would like to see the numbers (and, of course, take a quick look at their testing / benching methodology).

Thanks,

Ben

On Wed, Oct 17, 2012 at 1:53 AM, Charles Oliver Nutter head...@headius.com wrote:

> Hello all! I've recently been informed that a new Ruby implementation is about to be announced that puts JRuby's numeric perf to shame. Boo hoo. It's not like I expected us to retain the numeric crown, since we're still allocating objects for every number in the system, but hopefully we can get that crown back at some point.
>
> In an effort to start getting back to indy + perf work (with JRuby 1.7 almost released, finally), I bring you today's benchmark:
>
>     50.times {
>       puts Benchmark.measure {
>         f = 20.5
>         i = 0
>         while i < 2_000_000
>           f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1
>           f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1
>           f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1
>           f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1
>           i += 1
>         end
>       }
>     }
>
> So we have a 2M fixnum loop with ten float adds and ten float subtracts per iteration. Other variations of this have more iterations and fewer float operations, or put the whole loop inside a times {} block. This version runs in about 0.34s on hotspot-comp + Christian's patches, which beats Java 7 at 0.39s. If I remove some rarely-followed boolean logic in the creation of all Ruby objects (including floats), I can get this down to 0.29s. This is many times faster than almost all current Ruby implementations.
>
> However, this new Ruby impl runs the same code in around 0.1s, so even with everything inlining, JRuby + indy + hotspot-comp + patches is still 3x slower. I suspect Float allocation is the main bottleneck here.
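For reference, the plain-Java loop behind the 0.39s number above presumably looks something like this. This is a sketch, not the actual harness from the thread; the class name and timing scaffolding are invented. The point of the comparison is that this version keeps every intermediate in a primitive double, while JRuby allocates a RubyFloat for each one.

```java
// Sketch of the Java-side equivalent of the Ruby microbenchmark:
// same 20 float ops per iteration, primitive doubles throughout,
// so there is no per-number object allocation to escape-analyze away.
public class FloatBench {
    static double run(int iterations) {
        double f = 20.5;
        int i = 0;
        while (i < iterations) {
            f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1;
            f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1;
            f += 0.1; f -= 0.1; f += 0.1; f -= 0.1; f += 0.1;
            f -= 0.1; f += 0.1; f -= 0.1; f += 0.1; f -= 0.1;
            i += 1;
        }
        return f;
    }

    public static void main(String[] args) {
        // Mirror the Ruby harness: 50 timed runs of a 2M-iteration loop.
        for (int runNo = 0; runNo < 50; runNo++) {
            long t0 = System.nanoTime();
            double f = run(2_000_000);
            System.out.printf("%.3fs (f = %.1f)%n",
                              (System.nanoTime() - t0) / 1e9, f);
        }
    }
}
```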
> Here's logc output for one of the adds:
>
>     @ 251 java.lang.invoke.LambdaForm$MH::linkToCallSite (18 bytes)
>     @ 1 java.lang.invoke.Invokers::getCallSiteTarget (8 bytes)
>     @ 4 java.lang.invoke.MutableCallSite::getTarget (5 bytes)
>     @ 14 java.lang.invoke.MethodHandle::invokeBasic (0 bytes)
>     @ 14 java.lang.invoke.LambdaForm$BMH::reinvoke (32 bytes)
>     @ 13 java.lang.invoke.BoundMethodHandle$Species_LD::reinvokerTarget (8 bytes)
>     @ 28 java.lang.invoke.MethodHandle::invokeBasic (0 bytes)
>     @ 28 java.lang.invoke.LambdaForm$DMH::invokeStatic_LLLD_L (20 bytes)
>     @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes)
>     @ 16 java.lang.invoke.MethodHandle::linkToStatic (0 bytes)
>     @ 16 org.jruby.runtime.invokedynamic.MathLinker::float_op_plus (10 bytes)
>     @ 6 org.jruby.RubyFloat::op_plus (14 bytes)
>     @ 1 org.jruby.RubyBasicObject::getRuntime (8 bytes)
>     @ 1 org.jruby.RubyBasicObject::getMetaClass (5 bytes)
>     @ 4 org.jruby.RubyClass::getClassRuntime (5 bytes)
>     @ 10 org.jruby.RubyFloat::newFloat (10 bytes)
>     @ 6 org.jruby.RubyFloat::<init> (15 bytes)
>     @ 3 org.jruby.Ruby::getFloat (5 bytes)
>     @ 6 org.jruby.RubyNumeric::<init> (7 bytes)
>     @ 3 org.jruby.RubyObject::<init> (7 bytes)
>     @ 3 org.jruby.RubyBasicObject::<init> (30 bytes)
>     @ 1 java.lang.Object::<init> (1 bytes)
>
> This is *great*. We're getting all paths inlined, and allocation inlines all the way up to Object::<init>, so in theory escape analysis could get rid of this... RIGHT? WRONG!!! logc appears to be missing some output (either the tool or the LogCompilation flag is dropping information).
> The same block of code from PrintInlining:
>
>     @ 207 java.lang.invoke.LambdaForm$MH/1942422426::linkToCallSite (18 bytes) inline (hot)
>     @ 1 java.lang.invoke.Invokers::getCallSiteTarget (8 bytes) inline (hot)
>     @ 4 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
>     @ 14 java.lang.invoke.LambdaForm$MH/1896635336::guard (80 bytes) inline (hot)
>     @ 12 java.lang.Class::cast (27 bytes) inline (hot)
>     @ 6 java.lang.Class::isInstance (0 bytes) (intrinsic)
>     @ 17 java.lang.invoke.LambdaForm$BMH/1650319731::reinvoke (30 bytes) inline (hot)
>     @ 13 java.lang.invoke.BoundMethodHandle$Species_LL::reinvokerTarget (8 bytes) inline (hot)
>     @ 26 java.lang.invoke.LambdaForm$DMH/842171382::invokeStatic_LL_I (15 bytes) inline (hot)
>     @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) inline (hot)
>     @ 11 org.jruby.runtime.invokedynamic.MathLinker::floatTest (20 bytes) inline (hot)
>     @ 8
Re: Latest experiments...happiness and sadness
I will indeed! Just preparing ahead of time for the hype machine to go into overdrive. Regardless of initial speed, there's an incredibly long tail to any Ruby implementation, and new ones won't be useful until months or years after they're first released.

- Charlie

On Wed, Oct 17, 2012 at 3:03 AM, Ben Evans benjamin.john.ev...@gmail.com wrote:

> Hi Charlie,
>
> Can you send us a decent link or two once it actually does drop? I'm not much of a Ruby head generally, but would like to see the numbers (and, of course, take a quick look at their testing / benching methodology).
>
> Thanks,
>
> Ben
Re: Latest experiments...happiness and sadness
On Wed, Oct 17, 2012 at 2:54 PM, Charles Oliver Nutter head...@headius.com wrote:

> I will indeed! Just preparing ahead of time for the hype machine to go into overdrive. Regardless of initial speed, there's an incredibly long tail to any Ruby implementation, and new ones won't be useful until months or years after they're first released.

It was ever thus. People seem to have this amazing cognitive bias for the behaviour of a paper tiger over the real thing. I've sometimes wondered if it's not a side-effect of the golden-path thinking that many (most?) developers seem to get taught.

Ben, quite unable to come up with a decent joke about tigers and long tails at short notice.

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: OS X OpenJDK 8 hotspot-comp + perf patches
This is a product build. I can run a fastdebug build if you need it (and really I need it too, since PrintAssembly is still broken with OpenJDK 8).

- Charlie

On Wed, Oct 17, 2012 at 11:54 AM, Mark Roos mr...@roos.com wrote:

> Thanks for the build, Charles. For my rtalk benchmarks:
>
>     jdk7          10.1 secs
>     jdk8          12.8 secs
>     your build    10.7 secs
>
> Looking good. Also no evidence of the class-not-found error. Is this a fastdebug build?
>
> mark
Re: hg: mlvm/mlvm/hotspot: value-obj: first cut
On 10/17/2012 05:23 PM, David Chase wrote:
> On 2012-10-16, at 5:14 AM, Remi Forax fo...@univ-mlv.fr wrote:
>> Frozen/locked is a runtime property, not a type property, so it's harder than that. You have to do a frozen check at the beginning of the method and pray that people will only use it with frozen objects and not unfrozen ones, because in that case you have to de-optimize. Maybe you can have two versions of the same method, one with the frozen semantics and one with the boxed one (this is what I have done in JDart).
>
> I'm still coming up to speed on this, but I thought that the entire point of having value objects is that we would have a non-standard interface for all methods dealing with value objects. Complex, boxed, is received as a single pointer to an object with headers and fields. Complex, unboxed, is received as a pair of doubles. The frozen check is punted to the caller, who in turn may have punted it to his caller, etc., potentially removing the need for all tests. Or did I read this wrong?
>
> The only place I see a need for a frozen check is when we are interoperating with legacy code that is not playing the frozen-object game, and that we want to run with complete legacy compatibility. In that case, the slow-and-boxed path also includes a frozen check -- if frozen, unbox the object and head for the fast path; otherwise, stay slow.
>
> From the notes (value-obj.txt) I see:
>
>   38 - the reference returned from the (unsafe) marking primitive must be used for all future accesses
>   39 - any previous references (including the one passed to the marking primitive) must be unused
>   40 - in practice, this means you must mark an object locked immediately after constructing it
>
> So, allocation of a value-object becomes something along the lines of
>
>     new java/lang/Integer
>     dup
>     iload ...
>     invokespecial java/lang/Integer.<init>(I)V
>     markingPrimitive
>
> But we can't rely on this, hence it is not a true type property. But we could make it be as-if. I think I have to assume some sort of a marker class (implements PermanentlyLockable).

A bit in the class header (equivalent to implementing PermanentlyLockable) means you now have two classes, the one with the old semantics and the one with the new semantics. If you can have them both at runtime, you make your inline cache less efficient; it's a problem I've had with PHP.reboot. Marking the instance seems a better idea.

> Then in bytecode version N+1, the verifier enforces this for all types implementing PL, and all methods trucking in PL-implementing objects will by default generate unboxed entry points. Except when dealing with legacy code, it's as good as a type.

100% of the code produced until now is what you call 'legacy' :)

> For legacy code, I think we have options. Simplest is just to box at the boundaries, with lazy compilation of boxed versions of PL-handling methods in modern bytecodes. I'm trying to decide if we can do better with flow analysis; I think it has to be non-publishing in the PL types, in addition to the other properties.

You have to box and unbox at boundaries, and because Java allows overriding, an interface can have two methods, one implemented with boxing semantics and another that uses the frozen semantics. So you need stub code in front of methods, similar to verified/unverified entry points.

> David

Rémi
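Rémi's point that frozen-ness is a runtime property, not a type property, can be made concrete with a minimal Java sketch (all names here are invented for illustration). Every mutator has to test a per-instance bit, and that per-instance check is exactly what the JIT would have to prove away, or guard with a deoptimization, before treating the object as a value:

```java
// Hypothetical sketch: frozen-ness as a per-instance runtime bit.
// Because the bit lives in the instance rather than the type, every
// mutating method needs a check; the compiler can only drop the check
// after proving the bit is set on every path reaching the call.
final class MutableComplex {
    private double re, im;
    private boolean frozen;            // the runtime property in question

    MutableComplex(double re, double im) { this.re = re; this.im = im; }

    // Ruby-style freeze: mark this instance permanently immutable.
    MutableComplex freeze() { frozen = true; return this; }

    void setRe(double re) {
        if (frozen) throw new IllegalStateException("frozen");
        this.re = re;
    }

    double re() { return re; }
    double im() { return im; }
}
```

Marking an object frozen immediately after construction, as the value-obj notes require, is what lets a compiler treat the bit "as-if" it were a type property: there is then no window in which an unfrozen instance can be observed.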
Re: Latest experiments...happiness and sadness
On Oct 17, 2012, at 8:33 AM, David Chase david.r.ch...@oracle.com wrote:
> On 2012-10-16, at 8:53 PM, Charles Oliver Nutter head...@headius.com wrote:
>> So *almost* everything is inlining, but one path (I believe it's the failure path from GWT, after talking with Christian) is not reached. Because Hotspot's EA can't do partial EA, any unfollowed paths that would receive the allocated object have to be considered escapes, and so anywhere we're doing guarded logic (either in indy or in Java code, like Fixnum overflow checks), the unfollowed paths prevent EA from happening. Boo-hoo. Thoughts?
>
> I'm very new to this (have not even looked at the source code to Hotspot yet), but is it possible to push the allocation/boxing to paths that are believed to be rarely taken?

That's what partial EA does. I'm trying to get Vladimir to work on it, and it seems I'm successful.

-- Chris

> This is not unlike region-based register allocation, where register allocation is limited to what are believed to be the hot regions, and you worry about region exits later -- if necessary, you can always spill there.
>
> David
Re: Latest experiments...happiness and sadness
On Wed, Oct 17, 2012 at 2:07 PM, Christian Thalinger christian.thalin...@oracle.com wrote:
> On Oct 17, 2012, at 8:33 AM, David Chase david.r.ch...@oracle.com wrote:
>> I'm very new to this (have not even looked at the source code to Hotspot yet), but is it possible to push the allocation/boxing to paths that are believed to be rarely taken?
>
> That's what partial EA does. I'm trying to get Vladimir to work on it and it seems I'm successful.

I started reading a bit about partial EA last night, specifically looking at how PyPy does it. In PyPy, the JIT treats accesses and calls against an object as acting against a virtual object. I did not see whether they actually allocate stack space for this, but my guess is that it's virtual in the sense that the data moves are still unoptimized, unemitted operations in the IR representation. If at some point the code takes a branch that needs to see the actual object, they reconstitute it based on the actual values at that point.

The concerns some have brought up about construction seem like non-issues here; if the constructor chain is simple and just does field updates (and doesn't allow the object to escape), then the inlined version of the constructor can be treated as acting against the virtual object (again, perhaps against a stack-allocated object, or just represented as object accesses in IR), so it still runs when it's supposed to. The object reconstitution that happens later just copies the current virtual object contents into new memory and proceeds from there.

I know very little about the current EA implementation in Hotspot. Was it designed to be able to eventually support partial EA?

- Charlie
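The escape problem being discussed can be reduced to a few lines of Java (class and field names here are invented for the sketch, not taken from JRuby):

```java
// Minimal illustration of the partial-EA problem. The Box allocated in
// sum() is only read field-wise on the hot path, but the rarely-taken
// branch publishes it to a static field -- a real escape. Whole-method
// escape analysis must therefore treat EVERY Box as escaping and keep
// the allocation. Partial EA would instead keep the Box virtual on the
// hot path and materialize ("reconstitute") it only when the cold
// branch is actually taken.
public class EscapeDemo {
    static final class Box {
        final double value;
        Box(double value) { this.value = value; }
    }

    static Box escaped;                  // opaque sink: forces an escape

    static double sum(double[] xs, boolean rare) {
        double total = 0.0;
        for (double x : xs) {
            Box b = new Box(x);          // candidate for scalar replacement
            if (rare) {
                escaped = b;             // cold path: b must exist on the heap
            }
            total += b.value;            // hot path: only the field is needed
        }
        return total;
    }
}
```

This is the same shape as JRuby's guarded logic: the guard-failure path would receive the freshly allocated RubyFloat, so the allocation cannot be eliminated even though the hot path never needs the object.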
Re: hg: mlvm/mlvm/hotspot: value-obj: first cut
On 2012-10-17, at 2:12 PM, Remi Forax fo...@univ-mlv.fr wrote:
>> But we can't rely on this, hence it is not a true type property. But we could make it be as-if. I think I have to assume some sort of a marker class (implements PermanentlyLockable).
>
> A bit in the class header (equivalent to implementing PermanentlyLockable) means you now have two classes, the one with the old semantics and the one with the new semantics. If you can have them both at runtime, you make your inline cache less efficient; it's a problem I've had with PHP.reboot. Marking the instance seems a better idea.

I'm not sure I follow this -- if j/l/Integer implements PermanentlyLockable, that's just one class. You end up with possibly two versions of each entrypoint that handle any Plockable, true, but this seems like a necessary consequence of supporting both legacy (boxed-only) and modern (unboxed) implementations of Plockable types. The entrypoints are different interfaces at the machine level; I don't see how you can avoid having two. But many of the entrypoints might be mere stubs/wrappers.

I've been trying to figure out (Bharadwaj Yadavilli stopped by, we talked about this) whether the per-instance Plockable bit needs to exist or not. Here are some assumptions I'm working from. If any of these are wrong, that would be useful to know:

- we want value types in the future
- we want value types passed and returned in unboxed form
- we want value types stored in arrays in unboxed form
- we can upcast an array of value-elements to an array of reference-elements
- we will sometimes box value types -- Object o = someInteger
- we must support legacy code
- we can use different compilation strategies for code depending on its bytecode version number

So, a strawman implementation might be the following:

Use of values that implement Plockable in modern bytecodes is guaranteed to conform to the various value-friendly restrictions. There's no extra bit, no extra call at allocation. They compile as value types; an occurrence of new-dup-loadargs-<init> is replaced with running the constructor on the args in local memory. The only exception is when they are upcast to a reference supertype.

In legacy bytecodes, none of this happens; it's just like today. Mentions of Plockable types are compiled as if they were boxed.

Compilation of any method that mentions a Plockable type in its signature depends on legacy/modern. In modern, the default implementation is unboxed, but a boxed stub is provided (perhaps lazily) for references from legacy code. In legacy, the default implementation is boxed, but an unboxed stub is provided (perhaps lazily) for references from modern code.

Arrays are nasty. In both modern and legacy code, arrays themselves are reference types, but arrays of Plockable elements store the elements as value types. In both modern and legacy code, loads from arrays of a reference type (in legacy code, Plockable is a reference type) with a Plockable subtype call a static factory method of the Plockable type that can create a boxed object given an array address and an index. This can require an element-type check before loads. Stores work in reverse, with the same assignment of responsibility to a method of the Plockable type. Similarly, field loads/stores across the legacy/modern boundary box/unbox as necessary to obtain the expected behavior.

Optimizations: in legacy code, use-def webs of Plockable that are free of identity-uses can be unboxed. Inlining of unboxing stubs from modern code might help here. In modern code, use-def webs that connect to calls to legacy methods can be boxed, since the value representation will give no savings there.

I assume I am missing something, because I think this is simpler than John's proposal. Am I skipping ahead straight to value types too quickly?

David
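David's boxed/unboxed dual-entrypoint scheme can be sketched, very loosely, at the Java source level. All names here are hypothetical, and the real mechanism would live in the JIT and calling convention rather than in source code; the sketch only shows the shape of the two entry points and the stub that bridges them:

```java
// Hypothetical rendering of dual entry points for a value-like type.
// Modern callers invoke addUnboxed directly, passing components in
// registers; legacy callers link against addBoxed, a stub that unboxes
// the arguments, runs the real implementation, and reboxes the result.
public class DualEntry {
    static final class Complex {         // the boxed (legacy) form
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
    }

    // "Unboxed" entry point: components passed as scalars; the
    // caller-supplied out array stands in for a multi-value return.
    static void addUnboxed(double re1, double im1,
                           double re2, double im2, double[] out) {
        out[0] = re1 + re2;
        out[1] = im1 + im2;
    }

    // Boxed stub for legacy code: unbox, call the real implementation,
    // rebox. If the stub inlines, the Complex allocations become
    // candidates for escape analysis.
    static Complex addBoxed(Complex a, Complex b) {
        double[] out = new double[2];
        addUnboxed(a.re, a.im, b.re, b.im, out);
        return new Complex(out[0], out[1]);
    }
}
```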
Re: Latest experiments...happiness and sadness
On 10/17/2012 09:07 PM, Christian Thalinger wrote:
> On Oct 17, 2012, at 8:33 AM, David Chase david.r.ch...@oracle.com wrote:
>> I'm very new to this (have not even looked at the source code to Hotspot yet), but is it possible to push the allocation/boxing to paths that are believed to be rarely taken?
>
> That's what partial EA does. I'm trying to get Vladimir to work on it and it seems I'm successful.
>
> -- Chris

Graal also does partial EA; the code is available and readable.

Rémi
hg: mlvm/mlvm/jdk: meth-aclone.patch: point fix for bug reported by Remi
Changeset: d925ea8227c0
Author: jrose
Date: 2012-10-17 21:02 -0700
URL: http://hg.openjdk.java.net/mlvm/mlvm/jdk/rev/d925ea8227c0

meth-aclone.patch: point fix for bug reported by Remi

+ meth-aclone-8001105.patch
hg: mlvm/mlvm/jdk: meth-lfi: refactor LF.Template to IBG.CodePattern and do cleanups; also assign some bug numbers
Changeset: 51b63e67f83e
Author: jrose
Date: 2012-10-17 21:25 -0700
URL: http://hg.openjdk.java.net/mlvm/mlvm/jdk/rev/51b63e67f83e

meth-lfi: refactor LF.Template to IBG.CodePattern and do cleanups; also assign some bug numbers

+ anno-stable-8001107.patch
- anno-stable.patch
+ meth-lfi-8001106.patch
- meth-lfi.patch
! meth.patch
! series
hg: mlvm/mlvm/hotspot: assign some bug numbers
Changeset: b6f0babd7cf1
Author: jrose
Date: 2012-10-17 21:46 -0700
URL: http://hg.openjdk.java.net/mlvm/mlvm/hotspot/rev/b6f0babd7cf1

assign some bug numbers

! anno-stable-8001107.patch
- anno-stable.patch
! series
+ value-obj-800.patch
+ value-obj-800.txt
- value-obj.patch
- value-obj.txt