Re: Hotspot loves PHP.reboot / stack capturing exception
On Sep 14, 2011, at 3:05 PM, Thomas Wuerthinger wrote:

> But one could maybe disable the local variable liveness maps
> for the methods that use this functionality (i.e., the extended
> exception stack trace or the getLocalArray)?

Yes, of course. But this will increase spilling throughout those methods.

-- John
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot / stack capturing exception
On 9/14/11 10:20 PM, Tom Rodriguez wrote:
> On Sep 14, 2011, at 10:10 AM, Thomas Wuerthinger wrote:
>
>> On 13.09.2011 00:59, John Rose wrote:
>>> This exposes the question of liveness: JVMs routinely nullify
>>> non-live variables, but what if the only remaining use is the
>>> getLocalArray intrinsic? Shouldn't it count as a weak reference? Or
>>> will we allow it to "reanimate" all local values? That would impose a
>>> systemic cost on the register allocator, for the whole method.
>> I cannot think of a use case where nullifying non-live variables would
>> be a problem.
> But I don't think the compiler knows which locals are live in this case since
> the state is going to be passed to some unknown piece of code. Any JVMState
> used by getLocalArray would have to treat all locals as live.
>
> tom

Yes, true. I guess the fast-path version of a scripting language method might
leave out uses of locals that are necessary in the slow-path version. But one
could maybe disable the local variable liveness maps for the methods that use
this functionality (i.e., the extended exception stack trace or the
getLocalArray)?

- thomas
Re: Hotspot loves PHP.reboot / stack capturing exception
On Sep 14, 2011, at 10:10 AM, Thomas Wuerthinger wrote:

> On 13.09.2011 00:59, John Rose wrote:
>> This exposes the question of liveness: JVMs routinely nullify
>> non-live variables, but what if the only remaining use is the
>> getLocalArray intrinsic? Shouldn't it count as a weak reference? Or
>> will we allow it to "reanimate" all local values? That would impose a
>> systemic cost on the register allocator, for the whole method.
> I cannot think of a use case where nullifying non-live variables would
> be a problem.

But I don't think the compiler knows which locals are live in this case, since
the state is going to be passed to some unknown piece of code. Any JVMState
used by getLocalArray would have to treat all locals as live.

tom

>
>> Now I want to back up to Thomas' specific suggestion. Instead of
>> putting in a catch, suppose we use the "throws" clause of the method
>> to control local capture. This is a clever way to have existing
>> metadata encode an intention to collect locals "automagically",
>> without explicit bytecodes or handlers. It requires an overloading of
>> the idea of "throws", so it might be better to use a new attribute or
>> annotation.
>>
>> In any case, it seems to me that magic frame metadata which causes
>> fillInStackTrace (of selected Throwable types) to collect local values
>> is almost completely equivalent (modulo some simulation overhead) to
>> collecting the locals in each affected frame's handler.
>>
>> I say "almost" because the magic metadata can provide the local
>> information in unpopped frames, while the less magic method (which can
>> be done today) requires each dying frame to be popped, except perhaps
>> for the oldest, in order for the local values (and other state) to be
>> collected into the flying exception.
> Yes. The "simulation overhead" in terms of additional bytecodes and
> local variable liveness is however possibly significant. Also, the
> try-catch solution does not work for capturing expression stack values.
> The special-exception solution would enable language implementors to
> generate more compact (and simpler) bytecode methods (e.g., they could
> use the Java expression stack as their own expression stack implementation).
>
> - thomas
Re: Hotspot loves PHP.reboot / stack capturing exception
From Thomas:
> (e.g., they could use the Java expression stack as their own
> expression stack implementation).

I believe that this is both used and necessary to create a reasonably
performant implementation of Smalltalk on the JVM. I do this in Rtalk today,
mapping the Smalltalk stack to the JVM stack and then using JVMTI to get access
to the stack when I need it. There are some tradeoffs here in what can be done
compared with a true Smalltalk stack, but I don't see them as that critical
(they mostly concern the manipulation of continuations).

It does bring up John's concern that with a debugger available, security is
more difficult. The Smalltalk I am referencing for my implementation converts
the machine stack to objects and back as necessary. Perhaps this approach
would give an opportunity to control visibility at a fixed point. I believe
this is in line with some of this thread.

This brings up what I see as a larger requirement to fulfill the promise of
Smalltalk. In Smalltalk everything is an object (including the stack, heap,
VM, ...) which can be manipulated from the application side at will, even to
the extent of modifying the behavior of the VM or the GC. Most of this would
be possible if JVMTI were available from the object side. Having a
controlled-access API would also give a location where visibility and security
could be enforced.

regards
mark
Re: Hotspot loves PHP.reboot / stack capturing exception
On 13.09.2011 00:59, John Rose wrote:
> This exposes the question of liveness: JVMs routinely nullify
> non-live variables, but what if the only remaining use is the
> getLocalArray intrinsic? Shouldn't it count as a weak reference? Or
> will we allow it to "reanimate" all local values? That would impose a
> systemic cost on the register allocator, for the whole method.

I cannot think of a use case where nullifying non-live variables would be a
problem.

> Now I want to back up to Thomas' specific suggestion. Instead of
> putting in a catch, suppose we use the "throws" clause of the method
> to control local capture. This is a clever way to have existing
> metadata encode an intention to collect locals "automagically",
> without explicit bytecodes or handlers. It requires an overloading of
> the idea of "throws", so it might be better to use a new attribute or
> annotation.
>
> In any case, it seems to me that magic frame metadata which causes
> fillInStackTrace (of selected Throwable types) to collect local values
> is almost completely equivalent (modulo some simulation overhead) to
> collecting the locals in each affected frame's handler.
>
> I say "almost" because the magic metadata can provide the local
> information in unpopped frames, while the less magic method (which can
> be done today) requires each dying frame to be popped, except perhaps
> for the oldest, in order for the local values (and other state) to be
> collected into the flying exception.

Yes. The "simulation overhead" in terms of additional bytecodes and local
variable liveness is however possibly significant. Also, the try-catch
solution does not work for capturing expression stack values. The
special-exception solution would enable language implementors to generate more
compact (and simpler) bytecode methods (e.g., they could use the Java
expression stack as their own expression stack implementation).

- thomas
Re: Hotspot loves PHP.reboot / stack capturing exception
On Sep 12, 2011, at 10:23 AM, Thomas Wuerthinger wrote:

> If this special exception class is declared as a checked exception, a method
> would itself choose if its stack is exposed or not based on its "throws"
> clause. I think in that case the possible exploitations are less than
> reflection (because it would not be possible to access data declared
> "private", but only the stacks of methods that are explicitly declared
> accessible). That this is a first step towards continuation support would be
> a nice side effect of this solution.

We have explored reified JVM states already; it's interesting but hard to tame:
  http://hg.openjdk.java.net/mlvm/mlvm/hotspot/file/tip/callcc_old.txt

Let's assume (for the moment) that each participating frame will have a
handler which can contain special code (or metadata) to collect the required
locals of the participating frame.

There's a more direct way to solve the current problem, which doesn't require
a complex security model, and decouples the local-grabbing hack from
exceptions.

The idea is to introduce something like the x86 "pusha" instruction to the
JVM. Introduce a native-coded intrinsic which returns an Object[] array (or
tuple) of all the locals in the *immediate* caller of the intrinsic. (Should
be an instruction, maybe, but can be an intrinsic.)

  static int unity() {
    int x = 1;
    String y = "tu";
    System.out.println(Arrays.asList(System.getLocalArray()));
    // might print [1, tu] or [1, null]
    return x;
  }

This exposes the question of liveness: JVMs routinely nullify non-live
variables, but what if the only remaining use is the getLocalArray intrinsic?
Shouldn't it count as a weak reference? Or will we allow it to "reanimate"
all local values? That would impose a systemic cost on the register
allocator, for the whole method.

Perhaps an argument to getLocalArray (64-bit bitmask) could be used to select
locals explicitly. It's still gross, since you have to teach the register
allocator to look at calls to getLocalArray; and what if the argument is
non-constant? (This feels like the JVM version of JavaScript eval. BTW, I
don't think the exception-based formulation helps clean this up.)

The getLocalArray intrinsic could be defined in a way that is self-evidently
secure. Just like pusha is a shorthand for a lot of individual pushes, the
getLocalArray intrinsic could be defined as a shorthand for a lot of
individual data motion instructions. Besides compactness, the advantage of
the shorthand would be that it would hint to the system that the variable
definitions could be put on the slow path.

But all this could be accomplished more simply by just putting the data motion
instructions (apush #N etc.) in the slow path, just before a well-crafted
invokedynamic instruction. Let the normal profiling supply the required hint
about slow paths, and sink the data movement instructions into the deopt.
metadata.

Now I want to back up to Thomas' specific suggestion. Instead of putting in a
catch, suppose we use the "throws" clause of the method to control local
capture. This is a clever way to have existing metadata encode an intention
to collect locals "automagically", without explicit bytecodes or handlers. It
requires an overloading of the idea of "throws", so it might be better to use
a new attribute or annotation.

In any case, it seems to me that magic frame metadata which causes
fillInStackTrace (of selected Throwable types) to collect local values is
almost completely equivalent (modulo some simulation overhead) to collecting
the locals in each affected frame's handler.

I say "almost" because the magic metadata can provide the local information in
unpopped frames, while the less magic method (which can be done today)
requires each dying frame to be popped, except perhaps for the oldest, in
order for the local values (and other state) to be collected into the flying
exception.

-- John
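[Editorial sketch] The "shorthand for a lot of individual data motion instructions" reading of getLocalArray, including the bitmask selector, can be imitated in today's Java by passing the locals explicitly. This is only an illustration under stated assumptions: System.getLocalArray does not exist, and the helper name getLocalArray(long, Object...), the class name, and the slowPath flag below are hypothetical.

```java
import java.util.Arrays;

final class LocalArraySketch {
    // Hypothetical stand-in for the proposed intrinsic: the caller lists its
    // locals by hand, and bit i of the mask decides whether local i is kept.
    static Object[] getLocalArray(long mask, Object... locals) {
        Object[] out = new Object[locals.length];
        for (int i = 0; i < locals.length && i < 64; i++) {
            if ((mask & (1L << i)) != 0) {
                out[i] = locals[i]; // deselected slots stay null
            }
        }
        return out;
    }

    static int unity(boolean slowPath) {
        int x = 1;
        String y = "tu";
        if (slowPath) {
            // The data motion sits only on the slow path, so the fast path
            // does not have to keep y alive for it.
            System.out.println(Arrays.asList(getLocalArray(0b11L, x, y)));
        }
        return x;
    }

    public static void main(String[] args) {
        unity(true);
    }
}
```

With mask 0b01 instead of 0b11 the second slot is left as null, which mirrors the "[1, null]" outcome mentioned in the message, except that here the choice is made explicitly by the caller rather than by the JIT's liveness analysis.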
Re: Hotspot loves PHP.reboot / stack capturing exception
On 09.09.2011 03:00, John Rose wrote:
> On Sep 8, 2011, at 5:35 PM, Thomas Wuerthinger wrote:
>> The operand stack and locals manipulation in the generated bytecodes must
>> exactly match the manipulations done by the scripting interpreter, but I
>> think that it is possible to align those (although there might be some
>> complexity due to the fact that certain value types require more than 1
>> stack slot). The safeAdd method could be intrinsified to deoptimize to the
>> Java interpreter. So, in case an overflow occurs, there would be a
>> two-level deoptimization: Java optimized code => Java interpreter (which
>> now actually throws the DeoptimizationException) => Scripting language
>> interpreter.
>
> Got it. I had missed the meaning of your phrase "at the JVM-level (in
> fillInStackTrace)". So the exception has an extra-heavy backtrace.

Yes, exactly.

> The backtrace amounts to a JVM continuation. Is there a way to do data
> hiding so that the language runtime can only see locals that it has a right
> to observe? That IMO is the problem with rich backtraces: They provide a
> very deep punch into the JVM, which can be exploited (like reflection) to
> break across trust boundaries. I guess one answer is that we could trust
> language runtimes. I'd rather find a more compartmentalized solution, hence
> my interest in patterns expressible in current bytecodes.

If this special exception class is declared as a checked exception, a method
would itself choose if its stack is exposed or not based on its "throws"
clause. I think in that case the possible exploitations are less than
reflection (because it would not be possible to access data declared
"private", but only the stacks of methods that are explicitly declared
accessible). That this is a first step towards continuation support would be
a nice side effect of this solution.

- thomas
Re: Hotspot loves PHP.reboot
I've taken the time to write a small prototype (by hand :( ), see
  http://www-igm.univ-mlv.fr/~forax/tmp/jsr292-deopt.zip
using Fibonacci as John suggested.

The idea is to transfer the control from fibo written using ints to fibo
written using Object (BigInteger at runtime). If an operation overflows, an
exception is thrown and the stack is reconstructed to restart the operation on
objects. Moreover, a call can now fail because the result value is not an int
anymore; in that case, the new return value is thrown inside an exception (one
per thread), again the exception is caught and the control flow is transferred
to the version that uses objects, just after the call.

To summarize, you can transfer the control flow either because an operation
fails or because the return value of a call is not an int anymore.

  public static int fibo(int n) {
    if (n < 2)
      return 1;
    return fibo(n - 1) + fibo(n - 2);
  }

In fibo, we have 5 deoptimization pointcuts: n - 1 can overflow, n - 2 can
overflow, + can overflow, the first call to fibo can return a big integer, and
the second call to fibo can return a big integer.

Each pointcut is encapsulated in a try/catch that jumps to a specific exception
handler. All exception handlers first push a constant (corresponding to the
pointcut number) and then jump to the same code, which loads all variables that
potentially store the stack and does an invokedynamic.

The invokedynamic calls a specific function (I have called it an exit
function) which is constructed from the fibonacci function but using objects.
Depending on the pointcut number, a preamble code jumps before an operation to
restart it or after a function call to replace the return value stored in the
exception. Because fibo is recursive, it can also call a plain old fibonacci
function using objects without preamble jump code.

You can note that the exit function (and the plain function if the function is
recursive) can be generated lazily, only if needed, i.e. if an overflow occurs.

Now the bench on my laptop for fibo(45), i.e. when there is no overflow:

  $ time java -cp .:classes JavaFibo        // classical Java, fibo(45) using ints
  real 0m7.393s
  user 0m7.238s
  sys  0m0.014s

  $ time java -cp .:classes JavaLongFibo    // classical Java, fibo(45) using longs
  real 0m6.021s
  user 0m6.000s
  sys  0m0.017s

  $ time java -cp .:classes Fibo            // ints + overflow detection
  real 0m8.263s
  user 0m8.112s
  sys  0m0.015s

Not that bad, but it's only with 5 pointcuts.

For the record, here is the time for fibo(47):

  $ time java -cp .:classes Fibo
  real 3m21.727s
  user 3m22.882s
  sys  0m0.513s

I know that the deoptimization bootstrap method can install a transfer method
that is a little faster, but I think that either BigInteger is slow or I have
made a mistake in the deoptimization code. Anyway, the result value is OK.

Rémi

On 09/08/2011 09:47 PM, John Rose wrote:
> On Sep 8, 2011, at 4:57 AM, Thomas Wuerthinger wrote:
>> Why not the following code pattern? Does it generate too many bytecodes?
>
> That's a reasonable alternative.
>
> It generates data movement bytecodes O(L * M), where L is the average number
> of live values at deopt points and M is the number of deopt points. The
> quadratic exponent on bytecode size bothers me, at least a little.
>
> The other pattern pins the live values into a common set of locals, reducing
> data movement bytecodes (and probably compiled code data movement).
>
> Using Remi's trick of an invokedynamic instead of a varargs array, the number
> of data movement bytecodes can be cut down about 3x (no iconst/aastore). But
> it's still quadratic.
>
> -- John
Re: Hotspot loves PHP.reboot
On Sep 8, 2011, at 5:35 PM, Thomas Wuerthinger wrote:

> The operand stack and locals manipulation in the generated bytecodes must
> exactly match the manipulations done by the scripting interpreter, but I
> think that it is possible to align those (although there might be some
> complexity due to the fact that certain value types require more than 1 stack
> slot). The safeAdd method could be intrinsified to deoptimize to the Java
> interpreter. So, in case an overflow occurs, there would be a two-level
> deoptimization: Java optimized code => Java interpreter (which now actually
> throws the DeoptimizationException) => Scripting language interpreter.

Got it. I had missed the meaning of your phrase "at the JVM-level (in
fillInStackTrace)". So the exception has an extra-heavy backtrace.

The backtrace amounts to a JVM continuation. Is there a way to do data hiding
so that the language runtime can only see locals that it has a right to
observe? That IMO is the problem with rich backtraces: They provide a very
deep punch into the JVM, which can be exploited (like reflection) to break
across trust boundaries.

I guess one answer is that we could trust language runtimes. I'd rather find
a more compartmentalized solution, hence my interest in patterns expressible
in current bytecodes.

-- John
Re: Hotspot loves PHP.reboot
On 09.09.2011 01:21, John Rose wrote:
> On Sep 8, 2011, at 4:06 PM, Thomas Wuerthinger wrote:
>> Here an example for a scripting method that performs a+b and is guessed to
>> not overflow.
>
> Your example is simplified by the fact that, in the handler, there is only
> one possible deopt point. What if there are two? E.g., what if your code
> is: add(a,b,c) := a+b+c .

The example could be any complex scripting function. So let's consider the
following method:

  function sumUp(n) {
    var sum=0;
    for (var i=0; i
  [... the rest of this example was lost in the archive ...]

The two deoptimization points are distinguished by the different Java bci at
the calls to safeAdd. For each of those points, there must be a slow-path
continuation. While generating the bytecodes for the method sumUpFastPath,
one can also generate a mapping from exception-bci => slowpath-entry-point.
Such a slowpath-entry-point could either be an interpreter loop method
(started with the current locals and scripting code position) or a lazily
generated generic OSR version of the scripting function. So the code could
look like:

  Object sumUpGeneric(int n) {
    try {
      return sumUpFastPath(n);
    } catch(DeoptimizationException e) {
      StackFrame f = e.getStackFrame("sumUpFastPath");
      ScriptingPosition p = map.get(f.bci());
      return invokeInterpreter(sumUpMethod, p, f.getLocals(),
                               f.getExpressionStack());
    }
  }

The operand stack and locals manipulation in the generated bytecodes must
exactly match the manipulations done by the scripting interpreter, but I think
that it is possible to align those (although there might be some complexity
due to the fact that certain value types require more than 1 stack slot). The
safeAdd method could be intrinsified to deoptimize to the Java interpreter.
So, in case an overflow occurs, there would be a two-level deoptimization:
Java optimized code => Java interpreter (which now actually throws the
DeoptimizationException) => Scripting language interpreter.

- thomas
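[Editorial sketch] The mapping from deoptimization point to slow-path entry can be tried out in plain Java if the exception carries the locals itself, since no JVM captures them automatically. The following is a minimal sketch under stated assumptions: the loop body (sum += i; i++) is a guess at the truncated example, an integer point id stands in for the bci, and every name other than sumUpFastPath, sumUpGeneric, safeAdd and DeoptimizationException is hypothetical.

```java
import java.math.BigInteger;

final class SumUpDeoptSketch {
    static final int ADD_POINT = 0;       // "sum + i" overflowed
    static final int INCREMENT_POINT = 1; // "i + 1" overflowed

    // Hypothetical exception: carries the point id and the locals by hand.
    static final class DeoptimizationException extends RuntimeException {
        final int point, i, sum;
        DeoptimizationException(int point, int i, int sum) {
            super(null, null, false, false); // skip stack trace capture
            this.point = point;
            this.i = i;
            this.sum = sum;
        }
    }

    // Adds in 64 bits; throws with the caller's state if the result is not an int.
    static int safeAdd(int a, int b, int point, int i, int sum) {
        long wide = (long) a + b;
        if (wide != (int) wide) throw new DeoptimizationException(point, i, sum);
        return (int) wide;
    }

    static int sumUpFastPath(int n) {
        int sum = 0;
        int i = 0;
        while (i < n) {
            sum = safeAdd(sum, i, ADD_POINT, i, sum);
            i = safeAdd(i, 1, INCREMENT_POINT, i, sum);
        }
        return sum;
    }

    // Slow-path continuation: resumes the loop from the captured state.
    static BigInteger sumUpSlowPath(int n, DeoptimizationException e) {
        BigInteger sum = BigInteger.valueOf(e.sum);
        long i = e.i;
        if (e.point == INCREMENT_POINT) i++; // this i was already added to sum
        for (; i < n; i++) sum = sum.add(BigInteger.valueOf(i));
        return sum;
    }

    static Object sumUpGeneric(int n) {
        try {
            return sumUpFastPath(n);
        } catch (DeoptimizationException e) {
            return sumUpSlowPath(n, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumUpGeneric(10));
        System.out.println(sumUpGeneric(65537));
    }
}
```

The slow path does not restart the function; it continues from the captured i and sum, which is the property the bci-to-entry-point map is meant to provide. In this particular loop the increment point cannot actually fire (i stays below n), so it is present only to show how a second point would be dispatched.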
Re: Hotspot loves PHP.reboot
If we have coroutines, yes!

Remi

John Rose wrote:
>On Sep 8, 2011, at 3:06 PM, Rémi Forax wrote:
>
>> but you can't get a live value unless you allow inserting live values into
>> the constant pool when linking the class (another old dream).
>
>Can we make a solution from ClassValue?
>
>-- John
Re: Hotspot loves PHP.reboot
On Sep 8, 2011, at 3:06 PM, Rémi Forax wrote:

> but you can't get a live value unless you allow inserting live values into
> the constant pool when linking the class (another old dream).

Can we make a solution from ClassValue?

-- John
Re: Hotspot loves PHP.reboot
On Sep 8, 2011, at 4:06 PM, Thomas Wuerthinger wrote:

> Here an example for a scripting method that performs a+b and is guessed to
> not overflow.

Your example is simplified by the fact that, in the handler, there is only one
possible deopt point. What if there are two? E.g., what if your code is:
add(a,b,c) := a+b+c .
Re: Hotspot loves PHP.reboot
On 08.09.2011 21:47, John Rose wrote:
> On Sep 8, 2011, at 4:57 AM, Thomas Wuerthinger wrote:
>> Why not the following code pattern? Does it generate too many bytecodes?
>
> That's a reasonable alternative.
>
> It generates data movement bytecodes O(L * M), where L is the average number
> of live values at deopt points and M is the number of deopt points. The
> quadratic exponent on bytecode size bothers me, at least a little.

But with a specialized DeoptimizeException that automatically captures the
values of the local variables and operand stack at the JVM-level (in
fillInStackTrace), there would not be any data movement bytecodes at all. It
would require a 1-1 correspondence between scripting language local variables
and Java bytecode local variables, but would be both efficient to generate and
performant at run time. The information could be captured for all stack
frames between the throwing method and the catching method.

Here an example for a scripting method that performs a+b and is guessed to not
overflow.

  Object add(int a, int b) {
    try {
      return fastPathAdd(a, b);
    } catch(DeoptimizationException e) {
      Integer local1 = e.getTopFrame().getLocalAt(0);
      Integer local2 = e.getTopFrame().getLocalAt(1);
      return slowPathAdd(local1, local2);
    }
  }

  int fastPathAdd(int a, int b) {
    if (canOverflow(a, b)) throw new DeoptimizationException();
    return a + b;
  }

  Object slowPathAdd(Object a, Object b) {
    // ... generic add implementation ...
  }

- thomas
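[Editorial sketch] The add example above relies on a JVM that fills the exception with the thrower's locals, which no current JVM does. To experiment with the control flow anyway, the exception can be handed the locals explicitly. This is a sketch under that assumption: the locals field replaces the proposed getTopFrame().getLocalAt(n) accessors, and BigInteger is chosen here as the generic slow-path representation.

```java
import java.math.BigInteger;

final class AddDeoptSketch {
    // Hypothetical stand-in for the proposed exception type; the locals are
    // passed in by the thrower instead of being captured in fillInStackTrace.
    static final class DeoptimizationException extends RuntimeException {
        final Object[] locals;
        DeoptimizationException(Object... locals) {
            super(null, null, false, false); // no message, cause, or stack trace
            this.locals = locals;
        }
    }

    // The sum of two ints always fits in a long, so compare against the narrowed value.
    static boolean canOverflow(int a, int b) {
        long wide = (long) a + (long) b;
        return wide != (int) wide;
    }

    static int fastPathAdd(int a, int b) {
        if (canOverflow(a, b)) throw new DeoptimizationException(a, b);
        return a + b;
    }

    // Generic add; in this sketch both operands are known to be boxed ints.
    static Object slowPathAdd(Object a, Object b) {
        return BigInteger.valueOf((Integer) a).add(BigInteger.valueOf((Integer) b));
    }

    static Object add(int a, int b) {
        try {
            return fastPathAdd(a, b);
        } catch (DeoptimizationException e) {
            return slowPathAdd(e.locals[0], e.locals[1]);
        }
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));
        System.out.println(add(Integer.MAX_VALUE, 1));
    }
}
```

Passing the locals at the throw site is exactly the data movement that the proposal wants to eliminate, so this shows the shape of the handler rather than the claimed savings.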
Re: Hotspot loves PHP.reboot
On 09/08/2011 01:12 AM, John Rose wrote:
> On Sep 7, 2011, at 3:28 PM, Rémi Forax wrote:
>
>> On 09/07/2011 10:38 PM, John Rose wrote:
>>>   Object l0, l1, l2, ...;
>>>   l0 = l1 = l2 = ... null;  // these are required only for definite
>>>                             // assignment in the catch body
>>>   try {
>>>     ...do fast path...
>>>     if (...constraint violated...) throw new DeoptExc(...);
>>>     return ...fast result...
>>>   } catch (DeoptExc ex) {
>>>     Object[] display = { l0, l1, l2, ... };
>>>     return backupInterpreter(ex, display);  // N.B. could throw DeoptExc
>>>                                             // to caller also
>>>   }
>>>
>>> This problem is that the eager initializations of the various locals slow
>>> down the fast path. (Did I get it right Remi?) The register allocator
>>> should push them down into the catch body, and maybe even into the static
>>> debug info of the JVM.
>> Locals is not the only problem, you have to track stack values too.
> To clarify, I meant only JVM locals, which serve as virtual registers for the
> application language locals, stacks, whatever. JVM stack elements are
> guaranteed to be thrown away when a JVM exception is thrown, so they function
> like virtual registers that are blown by deopt. events.
>
>> It's the combination of locals initialization, local variable creation
>> for storing stack values
> Yes, if there are logically stacked values that need tracking into slow
> paths, they need to be spilled to JVM locals. The cost of this will be
> reduced by good register allocations in the JIT. Ensuring this is part of
> the tuning job.
>
>> and supplementary exception handlers
>> (that's the main problem, or you have multiple handlers,
>> or you have one handler and a constant)
>> that slow down the fast path.
> I think a single handler is going to be simpler all the way around. Thus,
> the deopt. source location and value display map have to be stored somewhere
> so they can be made a parameter to the handler. I think the exception object
> is the most likely place to store it; TLS is also a candidate.

Currently I'm thinking of using multiple exception handler entry points but
only one common code.

  handler1:
    iconst_0
    goto common_handler
  handler2:
    iconst_1
    goto common_handler
  ...
  common_handler:
    swap
    pop             // remove exception object
    aload spill1
    aload spill2
    ...
    invokedynamic foo (ILObject;LObject ...)

>
> To extend on your cute idea below, have the source location and value display
> map be static parameters to the invokedynamic instruction which initiates
> transfer to the slow path. That way everything is in the class file, but
> lazily unpacked only if needed. There's little or no need to use executable
> bytecode instructions (other than an indy at the throw point and an indy in
> the local handler) to manage the deopt. transition. The actual bytecodes can
> be devoted to executing the fast path code, and testing for type-state faults.

For PHP.reboot I need the AST node corresponding to the operation that
overflows, so I need a live value. But I can use one parameter of the method
to transfer all AST nodes wrapped in one object bound to the current method.

>
> BTW, although one might think that the repertoire for static BSM arguments is
> limited (string, class, int, long, MH, MT), note that the MH gives a hook to
> open-ended constant formation. Just treat the MH as a thunk to be executed
> to materialize a desired constant.

But you can't get a live value unless you allow inserting live values into the
constant pool when linking the class (another old dream).

>
>> Also, in the catch block you can use invokedynamic to avoid
>> the object array creation at call site.
> Yes, cute idea. Simplifying code on the slow path should make for faster
> loading and startup.
>
> Or (putting on my Pack200 hat), invokedynamic is the ultimate code-stream
> compressor.
>
> -- John

Rémi
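[Editorial sketch] The "treat the MH as a thunk" remark can be illustrated outside of any bootstrap method: a method handle that points at a static factory is looked up once and invoked only when the constant is actually needed. This is a sketch only; the class name, makeDisplayMap, and the choice of a list of local names as the "constant" are hypothetical, and a real bootstrap method would receive the handle as a static argument rather than look it up itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;

final class ThunkConstantSketch {
    // Hypothetical "constant" with no direct constant-pool encoding:
    // the names of the locals saved at one deopt point.
    static List<String> makeDisplayMap() {
        return Arrays.asList("n", "sum", "i");
    }

    // A method handle is a legal static bootstrap argument; here it acts as a thunk.
    static MethodHandle displayMapThunk() {
        try {
            return MethodHandles.lookup().findStatic(
                    ThunkConstantSketch.class,
                    "makeDisplayMap",
                    MethodType.methodType(List.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // What a bootstrap method could do lazily: run the thunk to get the value.
    static Object materialize(MethodHandle thunk) {
        try {
            return thunk.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(materialize(displayMapThunk()));
    }
}
```

As Rémi's reply points out, a thunk can only rebuild a value from what the class file can name; it does not by itself give access to a pre-existing live object such as an AST node held by the language runtime.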
Re: Hotspot loves PHP.reboot
On Sep 8, 2011, at 4:57 AM, Thomas Wuerthinger wrote:

> Why not the following code pattern? Does it generate too many bytecodes?

That's a reasonable alternative.

It generates data movement bytecodes O(L * M), where L is the average number
of live values at deopt points and M is the number of deopt points. The
quadratic exponent on bytecode size bothers me, at least a little.

The other pattern pins the live values into a common set of locals, reducing
data movement bytecodes (and probably compiled code data movement).

Using Remi's trick of an invokedynamic instead of a varargs array, the number
of data movement bytecodes can be cut down about 3x (no iconst/aastore). But
it's still quadratic.

-- John
Re: Hotspot loves PHP.reboot
On 07.09.2011 22:38, John Rose wrote:
> For example, at the Summit Remi pointed out an optimization problem
> associated with this pattern:
>
>   Object l0, l1, l2, ...;
>   l0 = l1 = l2 = ... null;  // these are required only for definite
>                             // assignment in the catch body
>   try {
>     ...do fast path...
>     if (...constraint violated...) throw new DeoptExc(...);
>     return ...fast result...
>   } catch (DeoptExc ex) {
>     Object[] display = { l0, l1, l2, ... };
>     return backupInterpreter(ex, display);  // N.B. could throw
>                                             // DeoptExc to caller also
>   }
>
> This problem is that the eager initializations of the various locals
> slow down the fast path. (Did I get it right Remi?) The register
> allocator should push them down into the catch body, and maybe even
> into the static debug info of the JVM.

Why not the following code pattern? Does it generate too many bytecodes?

What about an exception that contains a *detailed* stack trace that includes
expression stack and local variables? That might solve the issue of capturing
the local variables altogether and provide a natural way of expressing
customized deoptimization.

  Object l0, l1, l2, ...;
  try {
    ...do fast path...
    if (...constraint violated...) {
      Object[] display = { l0, l1, l2, null, ... };
      throw new DeoptExc(display);
    }
    return ...fast result...
  } catch (DeoptExc ex) {
    return backupInterpreter(ex, ex.display);  // N.B. could throw DeoptExc
                                               // to caller also
  }

So with the *detailed* stack trace it would read:

  Object l0, l1, l2, ...;
  try {
    ...do fast path...
    if (...constraint violated...) throw new DeoptExc();
    return ...fast result...
  } catch (DeoptExc ex) {
    return backupInterpreter(ex, ex.getLocalsAndStack());  // N.B. could throw
                                                           // DeoptExc to caller also
  }

That could also work nicely with "addWithoutOverflow(a, b) throws DeoptExc".
In the optimized code the catch block and the throw-statement are both not
present (based on the branch prediction analysis). So in case of overflow the
VM would first do a Java-level deoptimization. Then the interpreter would
throw the DeoptExc, which will then lead to the scripting-level
deoptimization.

- thomas
Re: Hotspot loves PHP.reboot
* John Rose [2011-09-06 20:04] writes:
> On Sep 6, 2011, at 8:51 AM, Charles Oliver Nutter wrote:
>
>> Did we ever figure out if it's possible to trick Hotspot into doing a
>> JO instead of the raw bit-level operations? John/Christian/Tom: what
>> would it take to get HS to "know" that we're doing an integer
>> overflow-after-maths check and do the (faster?) JO?
>
> (1) Write a compelling API for something like Integer.addDetectingOverflow.
> (2) Roll it into JDK 8+epsilon.
> (3) Do the JIT work.
>
> People have thought on and off about (1) for many years, but with no clear
> winner. Exceptions or boxed objects have unpleasant interactions and are
> hard to use, while smuggling out the 33rd bit some other way (TLS, a long or
> double, a return-by-reference, a sentinel value) is painful.
>
> (This is a case where tuples would make things simple, but it is not enough
> to motivate introducing tuples.)

Your comment seems to imply that tuples or multiple return values will not be
available anytime soon. Do you have some information on what will and will
not be in JDK8?

Helmut
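[Editorial sketch] Of the options listed for getting the overflow bit out of an addition, "a long" is the one that needs no new API: the sum of two ints always fits in 64 bits, so the caller can compare the wide result with its narrowed value. A minimal sketch follows; the class and method names are hypothetical and are not a proposal for Integer.addDetectingOverflow.

```java
final class OverflowViaLong {
    // Add two ints in 64-bit arithmetic; this can never overflow a long.
    static long addWide(int a, int b) {
        return (long) a + (long) b;
    }

    // True when the wide result can be narrowed back to int without loss.
    static boolean fitsInInt(long wide) {
        return wide == (int) wide;
    }

    public static void main(String[] args) {
        long ok = addWide(1, 2);
        long overflowed = addWide(Integer.MAX_VALUE, 1);
        System.out.println(ok + " fits: " + fitsInInt(ok));
        System.out.println(overflowed + " fits: " + fitsInInt(overflowed));
    }
}
```

This is the "raw bit-level" style the question is about: it is correct, but the JIT sees a widening add and a compare rather than an add followed by a jump-on-overflow, which is why an intrinsic with a recognizable shape was being discussed.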
Re: Hotspot loves PHP.reboot
It would be really nice if you could have a class that was something like Double with overflow check initially and then when you detect an overflow substitute in something like BigDecimal instead, i.e. hot swapping of the object's class. This saves having to have a class with two fields and checking one field for null in each method. I believe hot-swapping of classes has been considered as a standard JVM addition and that there are some JVMs, particularly in debug mode, that can do this. Even a very limited form of hot swapping would be useful, you could say that the class must have exactly the same number of instance fields and these must have the same length or be padded and that it must have exactly the same number of virtual methods. Note that double, long, and pointer on many JVMs are 64 bits and therefore even with the limitation of same length you could do something useful (transitioning a number from int through double to arbitrary precision). -- Howard. On 8 September 2011 07:25, Charles Oliver Nutter wrote: > On Wed, Sep 7, 2011 at 2:00 AM, Per Bothner wrote: > > Kawa's gnu.math.IntNum already does this. It has only two fields: > > Yeah, I think I remember you mentioning this in one of the > arbitrary-precision math threads on JVM-L. I assume you could use an > intrinsic optimization for overflow checks too, yes? > > - Charlie > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > -- -- Howard. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 3:28 PM, Rémi Forax wrote: > On 09/07/2011 10:38 PM, John Rose wrote: >> >> Object l0, l1, l2, ...; >> l0 = l1 = l2 = ... null; // these are required only for definite >> assignment in the catch body >> try { >> ...do fast path... >> if (...constraint violated...) throw new DeoptExc(...); >> return ...fast result... >> } catch (DeoptExc ex) { >> Object[] display = { l0, l1, l2, ... }; >> return backupInterpreter(ex, display); // N.B. could throw DeoptExc to >> caller also >> } >> >> This problem is that the eager initializations of the various locals slow >> down the fast path. (Did I get it right Remi?) The register allocator >> should push them down into the catch body, and maybe even into the static >> debug info of the JVM. > > Locals is not the only problem, you have to track stack values too. To clarify, I meant only JVM locals, which serve as virtual registers for the application language locals, stacks, whatever. JVM stack elements are guaranteed to be thrown away when a JVM exception is thrown, so they function like virtual registers that are blown by deopt. events. > It's the combination of locals initialization, local variable creation > for storing stack values Yes, if there are logically stacked values that need tracking into slow paths, they need to be spilled to JVM locals. The cost of this will be reduced by good register allocations in the JIT. Ensuring this is part of the tuning job. > and supplementary exception handlers > (that's the main problem, or you have multiple handlers, > or you have one handler and a constant) > that slow down the fast path. I think a single handler is going to be simpler all the way around. Thus, the deopt. source location and value display map have to be stored somewhere so they can be made a parameter to the handler. I think the exception object is the most likely place to store it; TLS is also a candidate. 
To extend on your cute idea below, have the source location and value display map be static parameters to the invokedynamic instruction which initiates transfer to the slow path. That way everything is in the class file, but lazily unpacked only if needed. There's little or no need to use executable bytecode instructions (other than an indy at the throw point and an indy in the local handler) to manage the deopt. transition. The actual bytecodes can be devoted to executing the fast path code, and testing for type-state faults. BTW, although one might think that the repertoire for static BSM arguments is limited (string, class, int, long, MH, MT), note that the MH gives a hook to open-ended constant formation. Just treat the MH as a thunk to be executed to materialize a desired constant. > Also, in the catch block you can use invokedynamic to avoid > the object array creation at call site. Yes, cute idea. Simplifying code on the slow path should make for faster loading and startup. Or (putting on my Pack200 hat), invokedynamic is the ultimate code-stream compressor. -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
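A rough Java sketch of the idea above, in which the source location and the value display map are static arguments to a bootstrap method. The names DeoptBootstrapSketch, bootstrap, slowPath and demo, and the choice of an int location plus a String map, are assumptions for illustration. Since javac does not emit invokedynamic for ordinary calls, the sketch links the call site by hand instead of through a real indy instruction.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

final class DeoptBootstrapSketch {
    // Hypothetical bootstrap method. The two trailing parameters play the role of
    // the static arguments that an invokedynamic instruction would supply.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type,
                              int sourceLocation, String displayMap)
            throws ReflectiveOperationException {
        MethodHandle slow = MethodHandles.lookup().findStatic(
                DeoptBootstrapSketch.class, "slowPath",
                MethodType.methodType(Object.class, int.class, String.class, Object[].class));
        // Bind the static data once, at link time.
        MethodHandle bound = MethodHandles.insertArguments(slow, 0, sourceLocation, displayMap);
        // Collect the live values passed at the call site into an Object[].
        return new ConstantCallSite(
                bound.asCollector(Object[].class, type.parameterCount()).asType(type));
    }

    // Toy slow path: just reports what it was given.
    static Object slowPath(int sourceLocation, String displayMap, Object[] liveValues) {
        return "deopt@" + sourceLocation + " " + displayMap + "=" + Arrays.toString(liveValues);
    }

    // Links the site manually and invokes it with two live int locals.
    static Object demo(int x, int y) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "deopt",
                    MethodType.methodType(Object.class, int.class, int.class), 42, "x,y");
            return site.dynamicInvoker().invoke(x, y);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The location and map are unpacked only when the site is linked, which matches the "lazily unpacked only if needed" point; no array is built in the calling bytecode.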
Re: Hotspot loves PHP.reboot
On 09/07/2011 03:02 PM, Rémi Forax wrote: > On 09/07/2011 09:32 PM, Per Bothner wrote: >> On 09/07/2011 02:15 AM, Rémi Forax wrote: >>> What about having 10 to 12 benchs, one by language, provided by each >>> language runtime developer >>> as a good bench for their runtime ? >> Well, there are the Computer Language Benchmark Game problems at >> http://shootout.alioth.debian.org/ . > ... > The shootout benchmark compares languages/runtimes, > I want to compare JVMs or versions of JVMs running > idiomatic code of each dynamic languages. But why can't the shootout benchmark programs also be useful for testing and comparing JVMs? -- --Per Bothner p...@bothner.com http://per.bothner.com/ ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 02:25 PM, Charles Oliver Nutter wrote: > On Wed, Sep 7, 2011 at 2:00 AM, Per Bothner wrote: >> Kawa's gnu.math.IntNum already does this. It has only two fields: > > Yeah, I think I remember you mentioning this in one of the > arbitrary-precision math threads on JVM-L. I assume you could use an > intrinsic optimization for overflow checks too, yes? That seems likely. Right now the code path is a bit complex, but in the case of adding two array-less IntNums we convert to long, and do a long addition. Then we check if the result is in the range [minFixNum..maxFixNum] of pre-allocated IntNums. Then we check if result==(int)result - i.e. if we got an actual overflow. So in addition to a fast overflow check it would be helpful to have a fast test that the result is in range of the preallocated "fixnums". If we had a cheap overflow test then I'd presumably change the code path to use int arithmetic (instead of long), but it might still be too much to inline. I think a fast overflow check is most helpful when you want to detect overflow and throw an exception; it's less of a win when you also need to allocate a BigInteger/IntNum. -- --Per Bothner p...@bothner.com http://per.bothner.com/ ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
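The addition path described above (widen to long, add, check the preallocated range, then check for int overflow) might look roughly like this in Java. This is not Kawa's code: the class name, the use of Integer and BigInteger as stand-ins for IntNum, and the cache bounds are all assumptions for illustration.

```java
final class FixnumAddSketch {
    // Assumed bounds for the preallocated "fixnum" cache (placeholders).
    static final int MIN_FIX = -100;
    static final int MAX_FIX = 1024;
    private static final Integer[] FIXNUMS = new Integer[MAX_FIX - MIN_FIX + 1];

    static {
        for (int i = 0; i < FIXNUMS.length; i++) {
            FIXNUMS[i] = Integer.valueOf(i + MIN_FIX);
        }
    }

    static Number add(int x, int y) {
        long result = (long) x + (long) y;           // cannot overflow in 64 bits
        if (result >= MIN_FIX && result <= MAX_FIX) {
            return FIXNUMS[(int) result - MIN_FIX];  // preallocated small value
        }
        if (result == (int) result) {
            return Integer.valueOf((int) result);    // fits in an int: no overflow
        }
        return java.math.BigInteger.valueOf(result); // actual int overflow
    }
}
```

The two range tests are the part a cheap overflow intrinsic would not remove, which is the point made about the fixnum check.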
Re: Hotspot loves PHP.reboot
On 09/07/2011 10:38 PM, John Rose wrote: On Sep 7, 2011, at 12:33 PM, Thomas Wuerthinger wrote: This would probably also mean that the exception object created for capturing the slow-case program state needs to be escape-analyzed and removed in the optimized code that deoptimizes on overflow? Yes. And in the case of user-defined deoptimization, the exception object would encode the continuation required by the source language. (A continuation would be something like a source location or an AST node, plus a map from local values to source variables. The local exception handler would package up the live local values into a display that could be loaded into the backup interpreter.) This exception object would be live only on the exception path, and so (as you say) could be built lazily from the JVM's debug info for EA. As we discussed at the JVM Language Summit, building user-level deoptimization on top of JVM-level deoptimization should allow non-Java languages to benefit from the same partial-compilation tactics that Java enjoys. The nicest part is that the basic components are in place; we just need to settle on effective patterns and tune up the associated JVM optimizations. For example, at the Summit Remi pointed out an optimization problem associated with this pattern: Object l0, l1, l2, ...; l0 = l1 = l2 = ... null; // these are required only for definite assignment in the catch body try { ...do fast path... if (...constraint violated...) throw new DeoptExc(...); return ...fast result... } catch (DeoptExc ex) { Object[] display = { l0, l1, l2, ... }; return backupInterpreter(ex, display); // N.B. could throw DeoptExc to caller also } This problem is that the eager initializations of the various locals slow down the fast path. (Did I get it right Remi?) The register allocator should push them down into the catch body, and maybe even into the static debug info of the JVM. Locals is not the only problem, you have to track stack values too. 
It's the combination of locals initialization, local variable creation for storing stack values and supplementary exception handlers (that's the main problem, or you have multiple handlers, or you have one handler and a constant) that slow down the fast path. Also, in the catch block you can use invokedynamic to avoid the object array creation at call site. If we had a benchmark that demonstrated the problems with such an approach, we could file a bug to tune the optimizer for it. The ideal benchmark would not actually run the deoptimization path (except to demonstrate correctness) but would measure the speed of the fast path, and the impact of the deopt support on the fast path. Integer overflow is an obvious candidate for a rare deopt condition, and so would a quasi-constant function binding (via a MutableCallsite). A tak or fib benchmark could demonstrate both. Ok, let's try with fib. -- John Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 09:32 PM, Per Bothner wrote: > On 09/07/2011 02:15 AM, Rémi Forax wrote: >> This remember me that we don't have any benchmarks using dynamic languages >> which is, as you explain, not good on the long term. >> >> What about having 10 to 12 benchs, one by language, provided by each >> language runtime developer >> as a good bench for their runtime ? > Well, there are the Computer Language Benchmark Game problems at > http://shootout.alioth.debian.org/ . Isaac Gouy doesn't want to support > more languages officially, but we can certainly use these as a starting > point on some alternative server. > > (There are fast Kawa versions of all the current benchmarks: > http://per.bothner.com/blog/2010/Kawa-in-shootout/ ) The shootout benchmark compares languages/runtimes, I want to compare JVMs or versions of JVMs running idiomatic code of each dynamic languages. Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Wed, Sep 7, 2011 at 2:00 AM, Per Bothner wrote: > Kawa's gnu.math.IntNum already does this. It has only two fields: Yeah, I think I remember you mentioning this in one of the arbitrary-precision math threads on JVM-L. I assume you could use an intrinsic optimization for overflow checks too, yes? - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 12:33 PM, Thomas Wuerthinger wrote: > This would probably also mean that the exception object created for capturing > the slow-case program state needs to be escape-analyzed and removed in the > optimized code that deoptimizes on overflow? Yes. And in the case of user-defined deoptimization, the exception object would encode the continuation required by the source language. (A continuation would be something like a source location or an AST node, plus a map from local values to source variables. The local exception handler would package up the live local values into a display that could be loaded into the backup interpreter.) This exception object would be live only on the exception path, and so (as you say) could be built lazily from the JVM's debug info for EA. As we discussed at the JVM Language Summit, building user-level deoptimization on top of JVM-level deoptimization should allow non-Java languages to benefit from the same partial-compilation tactics that Java enjoys. The nicest part is that the basic components are in place; we just need to settle on effective patterns and tune up the associated JVM optimizations. For example, at the Summit Remi pointed out an optimization problem associated with this pattern: Object l0, l1, l2, ...; l0 = l1 = l2 = ... null; // these are required only for definite assignment in the catch body try { ...do fast path... if (...constraint violated...) throw new DeoptExc(...); return ...fast result... } catch (DeoptExc ex) { Object[] display = { l0, l1, l2, ... }; return backupInterpreter(ex, display); // N.B. could throw DeoptExc to caller also } This problem is that the eager initializations of the various locals slow down the fast path. (Did I get it right Remi?) The register allocator should push them down into the catch body, and maybe even into the static debug info of the JVM. If we had a benchmark that demonstrated the problems with such an approach, we could file a bug to tune the optimizer for it. 
The ideal benchmark would not actually run the deoptimization path (except to demonstrate correctness) but would measure the speed of the fast path, and the impact of the deopt support on the fast path. Integer overflow is an obvious candidate for a rare deopt condition, and so would a quasi-constant function binding (via a MutableCallsite). A tak or fib benchmark could demonstrate both. -- John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
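A small Java sketch of the kind of fib kernel such a benchmark could exercise, with integer overflow as the rare deopt condition. All names here (FibDeoptSketch, Overflow, fibFast, fibSlow) are hypothetical, and the BigInteger fallback merely stands in for a backup interpreter; no timing harness is included.

```java
import java.math.BigInteger;

final class FibDeoptSketch {
    // Hypothetical stackless exception, preallocated so the fast path never allocates.
    static final class Overflow extends RuntimeException {
        static final Overflow INSTANCE = new Overflow();
        private Overflow() { super(null, null, false, false); }
    }

    // Fast path: int arithmetic with an explicit overflow test.
    static int fibFast(int n) {
        if (n < 2) return n;
        int a = 0, b = 1;
        for (int i = 2; i <= n; i++) {
            int r = a + b;
            if (((a ^ r) & (b ^ r)) < 0) throw Overflow.INSTANCE;
            a = b;
            b = r;
        }
        return b;
    }

    // Slow path: arbitrary precision.
    static BigInteger fibSlow(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        for (int i = 0; i < n; i++) {
            BigInteger r = a.add(b);
            a = b;
            b = r;
        }
        return a;
    }

    static Number fib(int n) {
        try {
            return fibFast(n);
        } catch (Overflow e) {
            return fibSlow(n);   // rare: restart on the slow path
        }
    }
}
```

A benchmark in the spirit of the message would time fib for arguments that never overflow and compare against a version without the handler.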
Re: Hotspot loves PHP.reboot
Rémi Forax wrote: > > so the VM knows that only methods *DetectingOverflow are able to > throw that specific exception. > int addDetectingOverflow(int x, int y) throws IntegerOverflowException > This also have the advantage that the inlining heuristic can be tweaked > to not count exception handlers that receive that specific exception. > > Generalize this to "UnexpectedTypeStateException" and build that > into a set of patterns that accomplish user-defined deoptimization. > That would be worth special-casing in the JVM. > > -- John___ Would not a specialized GWT be a nice way to handle this? Especially one where the developer could provide a test in an inline-assembly-like style for slot access, unboxing, and compares. It seems to me John mentioned an idea for a MethodHandle to inline code as an interesting concept. mark ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 02:15 AM, Rémi Forax wrote: > This remember me that we don't have any benchmarks using dynamic languages > which is, as you explain, not good on the long term. > > What about having 10 to 12 benchs, one by language, provided by each > language runtime developer > as a good bench for their runtime ? Well, there are the Computer Language Benchmark Game problems at http://shootout.alioth.debian.org/ . Isaac Gouy doesn't want to support more languages officially, but we can certainly use these as a starting point on some alternative server. (There are fast Kawa versions of all the current benchmarks: http://per.bothner.com/blog/2010/Kawa-in-shootout/ ) -- --Per Bothner p...@bothner.com http://per.bothner.com/ ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 9/7/11 1:10 AM, John Rose wrote: That's true, except that exceptions tend to be imprecise: It's hard to tell which sub-expression cause the exception, out of a complex statement. Addressing both the precision and pre-allocation problems, you could ask the application to create the exception: public static int addDetectingOverflow(int x, int y, X onOverflow) throws X This would probably also mean that the exception object created for capturing the slow-case program state needs to be escape-analyzed and removed in the optimized code that deoptimizes on overflow? - thomas ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 7:24 AM, Rémi Forax wrote: > so the VM knows that only methods *DetectingOverflow are able to throw that > specific exception. > int addDetectingOverflow(int x, int y) throws IntegerOverflowException > This also have the advantage that the inlining heuristic can be tweaked > to not count exception handlers that receive that specific exception. Generalize this to "UnexpectedTypeStateException" and build that into a set of patterns that accomplish user-defined deoptimization. That would be worth special-casing in the JVM. -- John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 05:07 AM, Charles Oliver Nutter wrote: On Tue, Sep 6, 2011 at 6:10 PM, John Rose wrote: > Yes. Your request for "JO" makes me think some users would be happy with a > boolean test, a la addWouldOverflow. > It's what happens after the test that differs widely among applications, so > why not just standardize the test. > if (addWouldOverflow(p, q)) { throw or return BigInt or ... } > return new Integer(p + q); > The p+q, if it occurs within addWouldOverflow(p, q), will value-number to > the explicit p+q, allowing the expected assembly code which computes p+q and > then checks the overflow bit. > (Actually, it's likely that the "addl p',q" instruction would occur twice, > because hotspot not very good at tracking condition codes.) That was my immediate concern. JO will act based on the last operation, so we wouldn't duplicate any work. Of course, at the level of multiple addl's it may be a small price to pay for a less code-order-sensitive option like addWouldOverflow. Thinking about how you'd JIT with such intrinsics made me realize the best case is still the full-on "addDetectingOverflow" since it could emit the add and jo operations all together in the proper order. Anything that depends on the bytecode ordering (iadd followed by this intrinsic call) would be tweaky, and then there's the simple fact that in the*absence* of JIT there's no real way to do "didAddOverflow" without passing everything in again like we do in JRuby now. Perhaps no gain in that case. Only the full "addDetectingOverflow" could reliable do the add and jo in precisely the correct order, figuring in any other register effects. > That's true, except that exceptions tend to be imprecise: It's hard to tell > which sub-expression cause the exception, out of a complex statement. 
> Addressing both the precision and pre-allocation problems, you could ask the > application to create the exception: > public static > int addDetectingOverflow(int x, int y, X onOverflow) throws X This is pretty good, though it's another unusual precedent for JDK (or at least I know of no APIs that have this form). Still, it might be the lightest-weight option, since it allows you to opt completely out of all allocation. The other solution is to do the strict opposite, to use an exception that have a private constructor so it can't be created by a user code and have no stacktrace, etc (see http://download.oracle.com/javase/7/docs/api/java/lang/Throwable.html#Throwable%28java.lang.String,%20java.lang.Throwable,%20boolean,%20boolean%29) so the VM knows that only methods *DetectingOverflow are able to throw that specific exception. int addDetectingOverflow(int x, int y) throws IntegerOverflowException This also have the advantage that the inlining heuristic can be tweaked to not count exception handlers that receive that specific exception. Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
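A minimal Java sketch of the alternative described above: an exception type with a private constructor and no stack trace, built on the four-argument Throwable constructor from the linked Java 7 documentation. Placing addDetectingOverflow inside the exception class is an assumption made here so that the private constructor is reachable; the thread does not say where the method would live.

```java
final class IntegerOverflowException extends Exception {
    // Single shared instance: user code cannot create another one.
    private static final IntegerOverflowException INSTANCE = new IntegerOverflowException();

    private IntegerOverflowException() {
        // message = null, cause = null, suppression disabled, stack trace not writable
        super(null, null, false, false);
    }

    static int addDetectingOverflow(int x, int y) throws IntegerOverflowException {
        int r = x + y;
        // Signed overflow happened iff both operands differ in sign from the result.
        if (((x ^ r) & (y ^ r)) < 0) {
            throw INSTANCE;
        }
        return r;
    }
}
```

Because the stack trace is not writable, throwing the shared instance costs no allocation and no stack walk.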
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 4:13 PM, Rémi Forax wrote: > On 09/07/2011 01:22 PM, Christian Thalinger wrote: >> >> >> On Sep 7, 2011, at 11:15 AM, Rémi Forax wrote: >> >>> On 09/07/2011 09:08 AM, John Rose wrote: On Sep 7, 2011, at 12:00 AM, Per Bothner wrote: > I assume this is one reason why Kawa's IntNum is (mostly) faster than > BigInteger. Yes, that's probably true. Here's a dirty secret: As you can see from the OpenJDK sources, BigDecimal, but not BigInteger, has this optimization (see private field BigDecimal.intCompact). Why BigDecimal but not BigInteger? Because specjbb2000 uses BigDecimal. (OTOH, an imperfect metric like specjbb2000 is far better than no metric at all, for driving competition.) >>> >>> This remember me that we don't have any benchmarks using dynamic languages >>> which is, as you explain, not good on the long term. >>> >>> What about having 10 to 12 benchs, one by language, provided by each >>> language runtime developer >>> as a good bench for their runtime ? >> >> That is a VERY good idea! How about if PHP.reboot contributes the first >> one? ;-) > > I've to think a bit about what is the best program for a bench of PHP.reboot. > I also think that we should provide only one jar with a special entrypoint. Right, that would be good. Maybe something like DaCapo where you have one jar file but you can also choose to run single benchmarks (e.g. java -jar jsr292bench.jar [phpreboot|jruby|seph|...]). > > What is the best location for that benchmark ? Not sure. What about Kenai? -- Christian > >> >> -- Christian > > Rémi > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 01:22 PM, Christian Thalinger wrote: On Sep 7, 2011, at 11:15 AM, Rémi Forax wrote: On 09/07/2011 09:08 AM, John Rose wrote: On Sep 7, 2011, at 12:00 AM, Per Bothner wrote: I assume this is one reason why Kawa's IntNum is (mostly) faster than BigInteger. Yes, that's probably true. Here's a dirty secret: As you can see from the OpenJDK sources, BigDecimal, but not BigInteger, has this optimization (see private field BigDecimal.intCompact). Why BigDecimal but not BigInteger? Because specjbb2000 uses BigDecimal. (OTOH, an imperfect metric like specjbb2000 is far better than no metric at all, for driving competition.) This remember me that we don't have any benchmarks using dynamic languages which is, as you explain, not good on the long term. What about having 10 to 12 benchs, one by language, provided by each language runtime developer as a good bench for their runtime ? That is a VERY good idea! How about if PHP.reboot contributes the first one? ;-) I've to think a bit about what is the best program for a bench of PHP.reboot. I also think that we should provide only one jar with a special entrypoint. What is the best location for that benchmark ? -- Christian Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 01:00 PM, MacGregor, Duncan (GE Energy) wrote: Could we do pass a method handle into this hypothetical to this hypothetical addDetectingOverflow and allow thus allow the caller to specify what should happen in the overflow case? Or does that still leave too much of a problem regarding actually returning the values? The return value is one problem because what you need to provide is not the return value of addDetectingOverflow but to the method that inlines (perhaps not directly) addDetectingOverflow. The other problem is that the VM will have to gather all locals to pass them as argument of the method handle. It's simpler to jump to a code that will call the interpreter (or another compiled code as PyPy does) hence the use of an exception. Rémi *From:*mlvm-dev-boun...@openjdk.java.net [mailto:mlvm-dev-boun...@openjdk.java.net] *On Behalf Of *John Rose *Sent:* 06 September 2011 21:05 *To:* Da Vinci Machine Project *Subject:* Re: Hotspot loves PHP.reboot On Sep 6, 2011, at 8:51 AM, Charles Oliver Nutter wrote: Did we ever figure out if it's possible to trick Hotspot into doing a JO instead of the raw bit-level operations? John/Christian/Tom: what would it take to get HS to "know" that we're doing an integer overflow-after-maths check and do the (faster?) JO? (1) Write a compelling API for something like Integer.addDetectingOverflow. (2) Roll it into JDK 8+epsilon. (3) Do the JIT work. People have thought on and off about (1) for many years, but with no clear winner. Exceptions or boxed objects have unpleasant interactions and are hard to use, while smuggling out the 33rd bit some other way (TLS, a long or double, a return-by-reference, a sentinel value) is painful. (This is a case where tuples would make things simple, but it is not enough to motivate introducing tuples.) 
-- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 11:15 AM, Rémi Forax wrote: > On 09/07/2011 09:08 AM, John Rose wrote: >> >> On Sep 7, 2011, at 12:00 AM, Per Bothner wrote: >> >>> I assume this is one reason why Kawa's IntNum is (mostly) faster than >>> BigInteger. >> >> Yes, that's probably true. >> >> Here's a dirty secret: As you can see from the OpenJDK sources, BigDecimal, >> but not BigInteger, has this optimization (see private field >> BigDecimal.intCompact). Why BigDecimal but not BigInteger? Because >> specjbb2000 uses BigDecimal. >> >> (OTOH, an imperfect metric like specjbb2000 is far better than no metric at >> all, for driving competition.) > > This remember me that we don't have any benchmarks using dynamic languages > which is, as you explain, not good on the long term. > > What about having 10 to 12 benchs, one by language, provided by each language > runtime developer > as a good bench for their runtime ? That is a VERY good idea! How about if PHP.reboot contributes the first one? ;-) -- Christian > >> >> -- John > > Rémi > > ___ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
RE: Hotspot loves PHP.reboot
Could we pass a method handle into this hypothetical addDetectingOverflow and thus allow the caller to specify what should happen in the overflow case? Or does that still leave too much of a problem regarding actually returning the values? From: mlvm-dev-boun...@openjdk.java.net [mailto:mlvm-dev-boun...@openjdk.java.net] On Behalf Of John Rose Sent: 06 September 2011 21:05 To: Da Vinci Machine Project Subject: Re: Hotspot loves PHP.reboot On Sep 6, 2011, at 8:51 AM, Charles Oliver Nutter wrote: Did we ever figure out if it's possible to trick Hotspot into doing a JO instead of the raw bit-level operations? John/Christian/Tom: what would it take to get HS to "know" that we're doing an integer overflow-after-maths check and do the (faster?) JO? (1) Write a compelling API for something like Integer.addDetectingOverflow. (2) Roll it into JDK 8+epsilon. (3) Do the JIT work. People have thought on and off about (1) for many years, but with no clear winner. Exceptions or boxed objects have unpleasant interactions and are hard to use, while smuggling out the 33rd bit some other way (TLS, a long or double, a return-by-reference, a sentinel value) is painful. (This is a case where tuples would make things simple, but it is not enough to motivate introducing tuples.) -- John ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/07/2011 09:08 AM, John Rose wrote: On Sep 7, 2011, at 12:00 AM, Per Bothner wrote: I assume this is one reason why Kawa's IntNum is (mostly) faster than BigInteger. Yes, that's probably true. Here's a dirty secret: As you can see from the OpenJDK sources, BigDecimal, but not BigInteger, has this optimization (see private field BigDecimal.intCompact). Why BigDecimal but not BigInteger? Because specjbb2000 uses BigDecimal. (OTOH, an imperfect metric like specjbb2000 is far better than no metric at all, for driving competition.) This reminds me that we don't have any benchmarks using dynamic languages, which is, as you explain, not good in the long term. What about having 10 to 12 benchmarks, one per language, each provided by that language's runtime developer as a good benchmark for their runtime? -- John Rémi ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 7, 2011, at 12:00 AM, Per Bothner wrote: > I assume this is one reason why Kawa's IntNum is (mostly) faster than > BigInteger. Yes, that's probably true. Here's a dirty secret: As you can see from the OpenJDK sources, BigDecimal, but not BigInteger, has this optimization (see private field BigDecimal.intCompact). Why BigDecimal but not BigInteger? Because specjbb2000 uses BigDecimal. (OTOH, an imperfect metric like specjbb2000 is far better than no metric at all, for driving competition.) -- John___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On 09/06/2011 08:07 PM, Charles Oliver Nutter wrote: > I also thought of an existing API that would benefit from this, and > perhaps there's a path to getting something in JDK 7 (unofficially) > and JDK 8 (officially): BigInteger. Ideally BigInteger should only use > a primitive long (or int?) up to its limits, and not allocate an array > until it exceeds those limits. Such an implementation would need to do > the same overflow checks JRuby does, and could benefit from > addDetectingOverflow. And we know there's constant cries for > BigInteger and BigDecimal perf to be improved...so I'd say every bit > helps. Kawa's gnu.math.IntNum already does this. It has only two fields: /** All integers are stored in 2's-complement form. * If words == null, the ival is the value of this IntNum. * Otherwise, the first ival elements of words make the value * of this IntNum, stored in little-endian order, 2's-complement form. */ public int ival; public int[] words; I assume this is one reason why Kawa's IntNum is (mostly) faster than BigInteger. -- --Per Bothner p...@bothner.com http://per.bothner.com/ ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
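To make the two-field layout quoted above concrete, here is a small Java sketch that decodes such a value according to the quoted comment (ival holds the value when words is null, otherwise the first ival words are little-endian two's complement). The class TwoFieldInt and its toBigInteger method are hypothetical and are not part of Kawa.

```java
import java.math.BigInteger;

final class TwoFieldInt {
    final int ival;     // the value itself, or the number of words in use
    final int[] words;  // null for small values; otherwise little-endian 32-bit words

    TwoFieldInt(int ival, int[] words) {
        this.ival = ival;
        this.words = words;
    }

    BigInteger toBigInteger() {
        if (words == null) {
            return BigInteger.valueOf(ival);   // compact case: no array allocated
        }
        // BigInteger(byte[]) expects big-endian two's complement, so reverse the word order.
        byte[] bigEndian = new byte[ival * 4];
        for (int i = 0; i < ival; i++) {
            int w = words[i];
            int base = (ival - 1 - i) * 4;
            bigEndian[base] = (byte) (w >>> 24);
            bigEndian[base + 1] = (byte) (w >>> 16);
            bigEndian[base + 2] = (byte) (w >>> 8);
            bigEndian[base + 3] = (byte) w;
        }
        return new BigInteger(bigEndian);
    }
}
```

For example, words = {0, 1} with ival = 2 decodes to 2^32, while ival = 42 with words = null is simply 42.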
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 6:10 PM, John Rose wrote: > Yes. Your request for "JO" makes me think some users would be happy with a > boolean test, a la addWouldOverflow. > It's what happens after the test that differs widely among applications, so > why not just standardize the test. > if (addWouldOverflow(p, q)) { throw or return BigInt or ... } > return new Integer(p + q); > The p+q, if it occurs within addWouldOverflow(p, q), will value-number to > the explicit p+q, allowing the expected assembly code which computes p+q and > then checks the overflow bit. > (Actually, it's likely that the "addl p',q" instruction would occur twice, > because hotspot not very good at tracking condition codes.) That was my immediate concern. JO will act based on the last operation, so we wouldn't duplicate any work. Of course, at the level of multiple addl's it may be a small price to pay for a less code-order-sensitive option like addWouldOverflow. Thinking about how you'd JIT with such intrinsics made me realize the best case is still the full-on "addDetectingOverflow" since it could emit the add and jo operations all together in the proper order. Anything that depends on the bytecode ordering (iadd followed by this intrinsic call) would be tweaky, and then there's the simple fact that in the *absence* of JIT there's no real way to do "didAddOverflow" without passing everything in again like we do in JRuby now. Perhaps no gain in that case. Only the full "addDetectingOverflow" could reliably do the add and jo in precisely the correct order, figuring in any other register effects. > That's true, except that exceptions tend to be imprecise: It's hard to tell > which sub-expression cause the exception, out of a complex statement. 
> Addressing both the precision and pre-allocation problems, you could ask the > application to create the exception: > public static > int addDetectingOverflow(int x, int y, X onOverflow) throws X This is pretty good, though it's another unusual precedent for JDK (or at least I know of no APIs that have this form). Still, it might be the lightest-weight option, since it allows you to opt completely out of all allocation. I also thought of an existing API that would benefit from this, and perhaps there's a path to getting something in JDK 7 (unofficially) and JDK 8 (officially): BigInteger. Ideally BigInteger should only use a primitive long (or int?) up to its limits, and not allocate an array until it exceeds those limits. Such an implementation would need to do the same overflow checks JRuby does, and could benefit from addDetectingOverflow. And we know there's constant cries for BigInteger and BigDecimal perf to be improved...so I'd say every bit helps. - Charlie ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Re: Hotspot loves PHP.reboot
On Sep 6, 2011, at 1:18 PM, Charles Oliver Nutter wrote:
> On Tue, Sep 6, 2011 at 3:04 PM, John Rose wrote:
>> (1) Write a compelling API for something like Integer.addDetectingOverflow.
>> (2) Roll it into JDK 8+epsilon.
>> (3) Do the JIT work.
>> People have thought on and off about (1) for many years, but with no clear
>> winner. Exceptions or boxed objects have unpleasant interactions and are
>> hard to use, while smuggling out the 33rd bit some other way (TLS, a long or
>> double, a return-by-reference, a sentinel value) is painful.
>> (This is a case where tuples would make things simple, but it is not enough
>> to motivate introducing tuples.)
>
> Not that it matters, but JRuby's integer math is all 64-bit...so we
> want a 65th bit of data out of maths :)

Most of the tricks apply to both 32- and 64-bit ints.

> ...that's a fun little
> API design problem to solve. Hmm.

Yes. Your request for "JO" makes me think some users would be happy
with a boolean test, a la addWouldOverflow. It's what happens after the
test that differs widely among applications, so why not just
standardize the test.

  if (addWouldOverflow(p, q)) { throw or return BigInt or ... }
  return new Integer(p + q);

The p+q, if it occurs within addWouldOverflow(p, q), will value-number
to the explicit p+q, allowing the expected assembly code which computes
p+q and then checks the overflow bit.
(Actually, it's likely that the "addl p',q" instruction would occur
twice, because hotspot is not very good at tracking condition codes.)

On Sep 6, 2011, at 1:36 PM, Rémi Forax wrote:
> An exception is perhaps easier to use,
> because if it overflows you may have to deoptimize, for that you need the
> stack and local values,
> it's easier to jump to an exception handler that will push all these values
> and call the interpreter.

That's true, except that exceptions tend to be imprecise: It's hard to
tell which sub-expression caused the exception, out of a complex
statement.

Addressing both the precision and pre-allocation problems, you could
ask the application to create the exception:

  public static <X extends Throwable>
  int addDetectingOverflow(int x, int y, X onOverflow) throws X

-- John
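A sketch of how the caller-supplied-exception shape might be written and used in plain Java, with no JIT support assumed. The signature follows the one proposed in the message above (with an `X extends Throwable` type parameter); the Overflow class, its INSTANCE field, and the addOrWiden caller are hypothetical names added for illustration.

```java
final class AddDetectingOverflowDemo {
    /** Hypothetical flow-control exception, allocated once and without a stack trace. */
    static final class Overflow extends Exception {
        static final Overflow INSTANCE = new Overflow();

        private Overflow() {
            // enableSuppression = false, writableStackTrace = false (constructor added in Java 7)
            super("int overflow", null, false, false);
        }
    }

    /** Returns x + y, or throws the caller's own exception object if the sum overflows. */
    static <X extends Throwable> int addDetectingOverflow(int x, int y, X onOverflow) throws X {
        int sum = x + y;
        // Same sign test as addWouldOverflow elsewhere in this thread.
        if (((x ^ y ^ -1) & (x ^ sum)) < 0) {
            throw onOverflow;
        }
        return sum;
    }

    /** Hypothetical caller: stays in int arithmetic, widens to long only on overflow. */
    static long addOrWiden(int x, int y) {
        try {
            return addDetectingOverflow(x, y, Overflow.INSTANCE);
        } catch (Overflow e) {
            return (long) x + (long) y;
        }
    }
}
```

Because the caller chooses the exception type and instance, each call site can use a distinct handler (which addresses precision) and nothing is allocated on the overflow path (which addresses pre-allocation).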
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 3:36 PM, Rémi Forax wrote:
> An exception is perhaps easier to use,
> because if it overflows you may have to deoptimize, for that you need the
> stack and local values,
> it's easier to jump to an exception handler that will push all these values
> and call the interpreter.

I tend to agree. An exception would be cleanest. However, an exception
must be caught...so you have the additional try/catch logic affecting
optimization, no?

Of course perf nuts would also want to pre-allocate that exception,
since it's really flow control. Dunno if there's any precedent for
that in the JVM other than OutOfMemoryError.

- Charlie
Re: Hotspot loves PHP.reboot
On 09/06/2011 10:19 PM, John Rose wrote:
> On Sep 6, 2011, at 12:58 PM, John Rose wrote:
>> What's needed here is a way to get 33 bits out of a 32-bit add intrinsic.
>> There's no fully natural way to do this, and about 4 kludgey ways. Because
>> there are so many poor ways to shape the API, it's hard to pick the best one
>> to invest in.
>
> But, assuming the user wants to force a JO, here's one fairly clean way
> to do it:
>
>   /**
>    * Detect 32-bit overflow if the parameters are summed.
>    * @return true if the inputs have the same sign, but their 32-bit sum
>    *         would have a different sign
>    */
>   public static boolean addWouldOverflow(int x, int y) {
>     //int res = x + y;
>     //boolean overflowDetected = (SGN(x) == SGN(y) && SGN(x) != SGN(res));
>     //boolean overflowDetected = ((x ^ y) >= 0 & (x ^ res) < 0);
>     //boolean overflowDetected = ((x ^ y ^ -1) & (x ^ res)) < 0;
>     return (((x ^ y ^ -1) & (x ^ (x+y))) < 0);
>   }
>
> That would provide a fairly stable and clear target for the JIT to aim at.
> No points for ease-of-use.

An exception is perhaps easier to use, because if it overflows you may
have to deoptimize, for that you need the stack and local values,
it's easier to jump to an exception handler that will push all these
values and call the interpreter.

> -- John

Rémi
Re: Hotspot loves PHP.reboot
On Sep 6, 2011, at 12:58 PM, John Rose wrote:
> What's needed here is a way to get 33 bits out of a 32-bit add intrinsic.
> There's no fully natural way to do this, and about 4 kludgey ways. Because
> there are so many poor ways to shape the API, it's hard to pick the best one
> to invest in.

But, assuming the user wants to force a JO, here's one fairly clean way
to do it:

  /**
   * Detect 32-bit overflow if the parameters are summed.
   * @return true if the inputs have the same sign, but their 32-bit sum
   *         would have a different sign
   */
  public static boolean addWouldOverflow(int x, int y) {
    //int res = x + y;
    //boolean overflowDetected = (SGN(x) == SGN(y) && SGN(x) != SGN(res));
    //boolean overflowDetected = ((x ^ y) >= 0 & (x ^ res) < 0);
    //boolean overflowDetected = ((x ^ y ^ -1) & (x ^ res)) < 0;
    return (((x ^ y ^ -1) & (x ^ (x+y))) < 0);
  }

That would provide a fairly stable and clear target for the JIT to aim
at. No points for ease-of-use.

-- John
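For illustration, here is how a language runtime might call such a boolean test, choosing its own reaction to overflow. The addWouldOverflow body is the one from the message above; the class name OverflowTestDemo and the add method that widens to BigInteger are hypothetical, and nothing here is intrinsified.

```java
import java.math.BigInteger;

final class OverflowTestDemo {
    /** True if the 32-bit sum x + y would overflow (sign test from the thread). */
    static boolean addWouldOverflow(int x, int y) {
        return (((x ^ y ^ -1) & (x ^ (x + y))) < 0);
    }

    /** Hypothetical runtime helper: stay boxed-int when possible, widen otherwise. */
    static Number add(int p, int q) {
        if (addWouldOverflow(p, q)) {
            return BigInteger.valueOf(p).add(BigInteger.valueOf(q));
        }
        return Integer.valueOf(p + q);
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));                 // prints 3
        System.out.println(add(Integer.MAX_VALUE, 1)); // prints 2147483648
    }
}
```

Note that p + q appears both inside the test and in the caller, which is the duplication the value-numbering remark in this thread is about.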
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 3:04 PM, John Rose wrote:
> (1) Write a compelling API for something like Integer.addDetectingOverflow.
> (2) Roll it into JDK 8+epsilon.
> (3) Do the JIT work.
> People have thought on and off about (1) for many years, but with no clear
> winner. Exceptions or boxed objects have unpleasant interactions and are
> hard to use, while smuggling out the 33rd bit some other way (TLS, a long or
> double, a return-by-reference, a sentinel value) is painful.
> (This is a case where tuples would make things simple, but it is not enough
> to motivate introducing tuples.)

Not that it matters, but JRuby's integer math is all 64-bit...so we
want a 65th bit of data out of maths :)

A couple observations:

* I think I speak for the entire JVM dynlang community when I say that
  we'd be happy to smuggle features into Java 6 via unsupported
  packages. In other words, if there were a sun.misc intrinsic we could
  link against, we'd happily do it when running on Hotspot and move to
  the formal way in JDK 8.

* I was about to suggest a closure-based approach to handling the
  overflow (a callback, basically) but then realized constructing the
  closure instance has a nontrivial cost too.

Yep, that's a fun little API design problem to solve. Hmm.

- Charlie
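For the 64-bit case, the sign-bit test used in this thread for int carries over to long unchanged apart from the operand width. The sketch below is only that widening; the class name LongOverflowDemo is hypothetical and this is not JRuby's actual code.

```java
final class LongOverflowDemo {
    /**
     * True if the 64-bit sum x + y would overflow: the operands have the
     * same sign, but the wrapped sum has a different sign.
     */
    static boolean addWouldOverflow(long x, long y) {
        return (((x ^ y ^ -1L) & (x ^ (x + y))) < 0);
    }

    public static void main(String[] args) {
        System.out.println(addWouldOverflow(Long.MAX_VALUE, 1L));  // prints true
        System.out.println(addWouldOverflow(Long.MIN_VALUE, -1L)); // prints true
        System.out.println(addWouldOverflow(40L, 2L));             // prints false
    }
}
```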
Re: Hotspot loves PHP.reboot
On Sep 6, 2011, at 8:51 AM, Charles Oliver Nutter wrote:
> Did we ever figure out if it's possible to trick Hotspot into doing a
> JO instead of the raw bit-level operations? John/Christian/Tom: what
> would it take to get HS to "know" that we're doing an integer
> overflow-after-maths check and do the (faster?) JO?

(1) Write a compelling API for something like Integer.addDetectingOverflow.
(2) Roll it into JDK 8+epsilon.
(3) Do the JIT work.

People have thought on and off about (1) for many years, but with no
clear winner. Exceptions or boxed objects have unpleasant interactions
and are hard to use, while smuggling out the 33rd bit some other way
(TLS, a long or double, a return-by-reference, a sentinel value) is
painful.

(This is a case where tuples would make things simple, but it is not
enough to motivate introducing tuples.)

-- John
Re: Hotspot loves PHP.reboot
On Sep 6, 2011, at 11:28 AM, Charles Oliver Nutter wrote:
> On Tue, Sep 6, 2011 at 12:36 PM, Rémi Forax wrote:
>> If you have specialized for -2/+2, you should reuse exactly the same code
>> for +n/-n.
>> see
>> https://code.google.com/p/jsr292-cookbook/source/browse/trunk/binary-operation/src/jsr292/cookbook/binop/RT.java#11
>
> You're right. I'll make that change. Thanks!

Also (from an early attempt at a cookbook):
http://hg.openjdk.java.net/mlvm/mlvm/file/tip/netbeans/indy-demo/src/recipes/FastAndSlow.java

This uses the same logic as Remi's safeAdd.

What's needed here is a way to get 33 bits out of a 32-bit add
intrinsic. There's no fully natural way to do this, and about 4 kludgey
ways. Because there are so many poor ways to shape the API, it's hard
to pick the best one to invest in.

-- John
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 12:36 PM, Rémi Forax wrote:
> If you have specialized for -2/+2, you should reuse exactly the same code
> for +n/-n.
> see
> https://code.google.com/p/jsr292-cookbook/source/browse/trunk/binary-operation/src/jsr292/cookbook/binop/RT.java#11

You're right. I'll make that change. Thanks!

- Charlie
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 12:33 PM, Christian Thalinger wrote:
> We already talked a bit about that some while ago. I think matching that
> double-xor-trick (or whatever it's called) is too risky. A JDK method that
> does the check (and the math?) would be nice so we can intrinsify it. GWT
> would fit here.

Yeah, talking with Kris Mok over Twitter we both agreed that an
intrinsic would be preferable. Prescribing a specific code shape is
fragile and dangerous if someone actually wants the xors in place (a
JO instruction would not be 100% equivalent, obviously).

> Btw, what does JO mean?

Jump if Overflow.

- Charlie
Re: Hotspot loves PHP.reboot
On 09/06/2011 05:51 PM, Charles Oliver Nutter wrote:
> On Tue, Sep 6, 2011 at 10:39 AM, Rémi Forax wrote:
>> Yes, but don't forget that PHP.reboot has no overflow check.
> Did we ever figure out if it's possible to trick Hotspot into doing a
> JO instead of the raw bit-level operations? John/Christian/Tom: what
> would it take to get HS to "know" that we're doing an integer
> overflow-after-maths check and do the (faster?) JO?
>
> Rémi: For what it's worth, I did specialize +/- for 1 and 2 in the
> indy-based call protocols to avoid a full overflow check.

If you have specialized for -2/+2, you should reuse exactly the same
code for +n/-n.
see
https://code.google.com/p/jsr292-cookbook/source/browse/trunk/binary-operation/src/jsr292/cookbook/binop/RT.java#11

> - Charlie

Rémi
Re: Hotspot loves PHP.reboot
On 09/06/2011 07:33 PM, Christian Thalinger wrote:
> On Sep 6, 2011, at 5:51 PM, Charles Oliver Nutter wrote:
>
>> On Tue, Sep 6, 2011 at 10:39 AM, Rémi Forax wrote:
>>> Yes, but don't forget that PHP.reboot has no overflow check.
>> Did we ever figure out if it's possible to trick Hotspot into doing a
>> JO instead of the raw bit-level operations? John/Christian/Tom: what
>> would it take to get HS to "know" that we're doing an integer
>> overflow-after-maths check and do the (faster?) JO?
> We already talked a bit about that some while ago. I think matching that
> double-xor-trick (or whatever it's called) is too risky. A JDK method that
> does the check (and the math?) would be nice so we can intrinsify it. GWT
> would fit here.
>
> Btw, what does JO mean?

Jump on overflow, I guess.

> -- Christian

Rémi
Re: Hotspot loves PHP.reboot
On Sep 6, 2011, at 5:51 PM, Charles Oliver Nutter wrote:
> On Tue, Sep 6, 2011 at 10:39 AM, Rémi Forax wrote:
>> Yes, but don't forget that PHP.reboot has no overflow check.
>
> Did we ever figure out if it's possible to trick Hotspot into doing a
> JO instead of the raw bit-level operations? John/Christian/Tom: what
> would it take to get HS to "know" that we're doing an integer
> overflow-after-maths check and do the (faster?) JO?

We already talked a bit about that some while ago. I think matching
that double-xor-trick (or whatever it's called) is too risky. A JDK
method that does the check (and the math?) would be nice so we can
intrinsify it. GWT would fit here.

Btw, what does JO mean?

-- Christian

> Rémi: For what it's worth, I did specialize +/- for 1 and 2 in the
> indy-based call protocols to avoid a full overflow check.
>
> - Charlie
Re: Hotspot loves PHP.reboot
On Tue, Sep 6, 2011 at 10:39 AM, Rémi Forax wrote:
> Yes, but don't forget that PHP.reboot has no overflow check.

Did we ever figure out if it's possible to trick Hotspot into doing a
JO instead of the raw bit-level operations? John/Christian/Tom: what
would it take to get HS to "know" that we're doing an integer
overflow-after-maths check and do the (faster?) JO?

Rémi: For what it's worth, I did specialize +/- for 1 and 2 in the
indy-based call protocols to avoid a full overflow check.

- Charlie
Re: Hotspot loves PHP.reboot
On 09/06/2011 04:59 PM, Charles Oliver Nutter wrote:
> Awesome numbers, especially promising for impls like JRuby that will
> never have type annotations and for which type inference will be very
> limited. Getting within 3x Java while still fully boxed is amazing.

Yes, but don't forget that PHP.reboot has no overflow check.

> Perhaps the next big thing for InDy will be getting EA working across
> invokedynamic boundaries? Perhaps someone can explain why that's
> difficult (or impossible) now, since it seems to me that generating a
> bytecoded form for MH trees should allow EA to work as with any other
> code. No?

Good question :)

> - Charlie

Rémi
Re: Hotspot loves PHP.reboot
Awesome numbers, especially promising for impls like JRuby that will
never have type annotations and for which type inference will be very
limited. Getting within 3x Java while still fully boxed is amazing.

Perhaps the next big thing for InDy will be getting EA working across
invokedynamic boundaries? Perhaps someone can explain why that's
difficult (or impossible) now, since it seems to me that generating a
bytecoded form for MH trees should allow EA to work as with any other
code. No?

- Charlie

On Mon, Sep 5, 2011 at 8:09 AM, Rémi Forax wrote:
> On 09/05/2011 12:22 PM, Christian Thalinger wrote:
>> On Sep 5, 2011, at 1:11 AM, Rémi Forax wrote:
>>> [...]
>>> John, Christian, Tom and all the others of the hotspot-comp team,
>>> you make my day :)
>> These numbers make mine too :-) Thanks for trying the current version.
>
> No, Thank you.
> Frankly, it's amazing to see how something that was at a point only
> in the collective mind of the JSR 292 EG is now real and fast
> thanks to your ability to massage hotspot.
>
>> -- Christian
>
> Rémi
Re: Hotspot loves PHP.reboot
On 09/05/2011 12:22 PM, Christian Thalinger wrote:
> On Sep 5, 2011, at 1:11 AM, Rémi Forax wrote:
>> [...]
>> John, Christian, Tom and all the others of the hotspot-comp team,
>> you make my day :)
> These numbers make mine too :-) Thanks for trying the current version.

No, Thank you.
Frankly, it's amazing to see how something that was at a point only
in the collective mind of the JSR 292 EG is now real and fast
thanks to your ability to massage hotspot.

> -- Christian

Rémi
Re: Hotspot loves PHP.reboot
On Sep 5, 2011, at 1:11 AM, Rémi Forax wrote:
> [...]
> John, Christian, Tom and all the others of the hotspot-comp team,
> you make my day :)

These numbers make mine too :-) Thanks for trying the current version.

-- Christian
Hotspot loves PHP.reboot
I've just compiled HotSpot (64-bit server) using the hotspot-comp
workspace of HotSpot Express (hsx):
http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/

Here are the results when running PHP.reboot on fibonacci
(-server is the server VM of jdk1.7.0):

Java:
  java -server bigfibo          4.45 s
  java -hsx bigfibo             4.44 s

PHP.reboot (no type annotation)
  phpr.sh -server bigfibo      22.72 s
  phpr.sh -hsx bigfibo         13.61 s

PHP.reboot (type specialization)
  phpr.sh -server bigfibo      11.09 s
  phpr.sh -hsx bigfibo          8.06 s

PHP.reboot (user defined type annotation)
  phpr.sh -server bigfibo2      6.96 s
  phpr.sh -hsx bigfibo2         4.21 s

PHP.reboot is a hybrid runtime: it starts with an interpreter
that walks the AST (really slow) and then compiles to bytecode.

The first test is with no type information provided by the user,
so all variables are objects and invokedynamic is used for the
operations, the comparisons and the function calls.
As you can see, there is a huge speedup.

The second test enables a flag that asks the runtime to try to
specialize the function at runtime. Because the algorithm
used is a fast-forward typechecker, the parameter of fibo
is specialized as int but the return type is still an object
(because fibo is recursive).
So basically here, invokedynamic is used for the function calls
and for the + between the results of the function calls.
This '+' is a nasty one because the two parameters are objects,
so it requires a double guard.
You can see the speedup is nice too.

The third test uses a file that declares the parameter type and
return type of fibo as int, so only the function calls are done
using invokedynamic.
You can also see the speedup, and weirdly it's now faster than Java
(not by a lot if you compare the values, but don't forget that
PHP.reboot starts in interpreter mode), so it's clearly faster.
I will take a look at the inlining tree to try to understand why;
maybe it's because fibo is a recursive call, or because using
an invokedynamic which is resolved as an invokestatic
enables more inlining than just an invokestatic.

John, Christian, Tom and all the others of the hotspot-comp team,
you make my day :)

cheers,
Rémi
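To show what a "double guard" on '+' can look like with the JSR 292 API, here is a minimal sketch using MethodHandles.guardWithTest, one guard per operand. This is not PHP.reboot's actual code: the class DoubleGuardDemo, the helper names, and the choice of a double-based slow path are assumptions made for illustration, and like PHP.reboot the fast path has no overflow check.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class DoubleGuardDemo {
    static boolean isInteger(Object o) {
        return o instanceof Integer;
    }

    /** Fast path: both operands are known to be boxed ints (no overflow check). */
    static Object addInts(Object a, Object b) {
        return ((Integer) a).intValue() + ((Integer) b).intValue();
    }

    /** Hypothetical slow path: treat both operands as generic numbers. */
    static Object addGeneric(Object a, Object b) {
        return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }

    static final MethodHandle PLUS = buildPlus();

    static MethodHandle buildPlus() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType binary = MethodType.methodType(Object.class, Object.class, Object.class);
            MethodHandle isInt = lookup.findStatic(DoubleGuardDemo.class, "isInteger",
                    MethodType.methodType(boolean.class, Object.class));
            MethodHandle fast = lookup.findStatic(DoubleGuardDemo.class, "addInts", binary);
            MethodHandle slow = lookup.findStatic(DoubleGuardDemo.class, "addGeneric", binary);
            // Second guard: ignore the first argument so the test sees the second operand.
            MethodHandle secondIsInt = MethodHandles.dropArguments(isInt, 0, Object.class);
            MethodHandle inner = MethodHandles.guardWithTest(secondIsInt, fast, slow);
            // First guard: the test only looks at the leading argument.
            return MethodHandles.guardWithTest(isInt, inner, slow);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object plus(Object a, Object b) {
        try {
            return PLUS.invoke(a, b);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

In a real runtime the resulting handle would be installed as the target of an invokedynamic call site rather than called through a static helper.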