Interpreting Mission Control numbers for indy
I've been playing with JMC a bit tonight, running a user's application that's about 2x slower using indy than using trivial monomorphic caches (and no indy call sites). I'm trying to understand how to interpret what I see.

In the Code/Overview results, where it lists hot packages, the #1 and #2 packages are java.lang.invoke.LambdaForm$MH and DMH, accounting for over 37% of samples. That sounds high, but I'm willing to grant they're hit pretty hard for a fully dynamic application.

Results in the Hot Methods tab show similar things, like LambdaForm...invokeStatic_LL_L as the number one result and LambdaForm entries dominating the top 50 entries in the profile. Again, I know I'm hitting dynamic call sites hard and sampling is not always accurate.

If I look at compilation events, I only see a handful of LambdaForm...convert being compiled. I'm not sure if that's good or bad. My assumption is that LFs don't show up here because they're always being inlined into a caller.

The performance numbers for the app have me worried too. If I run JRuby with stock settings, we will chain up to 6 call targets at a call site. The lower I drop this number, the better performance gets; when I drop all the way to zero, forcing all invokedynamic call sites to fail over immediately to a monomorphic inline cache, performance *almost* gets back to the non-indy implementation. This leads me to believe that the less I use invokedynamic (or the fewer LFs involved), the better. That doesn't bode well.

I believe the user would be happy to allow me to make these JMC recordings available, and I'm happy to re-run with additional events or gather other information.

The JRuby community has a number of very large applications that push the limits of indy. We should work together to improve it.

- Charlie
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
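For readers unfamiliar with the two strategies compared in the message above, here is a minimal sketch of a call site that chains guarded targets up to a limit and then fails over to a one-entry cache. It is not JRuby's code: the class ChainedSite, the resolve helper, and the (Object)Object call-site type are all assumptions made for illustration, and thread safety is ignored.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical sketch: chain up to maxChain type guards at a call site,
// then fail over to a plain monomorphic (one-entry) cache.
class ChainedSite extends MutableCallSite {
    static final MethodType SITE_TYPE = MethodType.methodType(Object.class, Object.class);
    private static final MethodHandle FALLBACK, IS_CLASS, CACHED;

    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            FALLBACK = l.findVirtual(ChainedSite.class, "fallback", SITE_TYPE);
            CACHED = l.findVirtual(ChainedSite.class, "cachedCall", SITE_TYPE);
            IS_CLASS = l.findStatic(ChainedSite.class, "isClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    final int maxChain;         // assumption: plays the role of the "up to 6" setting
    int depth;                  // number of guards currently chained
    Class<?> cachedClass;       // state of the one-entry cache
    MethodHandle cachedTarget;

    ChainedSite(int maxChain) {
        super(SITE_TYPE);
        this.maxChain = maxChain;
        setTarget(FALLBACK.bindTo(this));
    }

    static boolean isClass(Class<?> c, Object o) {
        return o != null && o.getClass() == c;
    }

    // Stand-in for method lookup: the "method" just returns the class name.
    static MethodHandle resolve(Class<?> cls) {
        MethodHandle name = MethodHandles.constant(Object.class, cls.getSimpleName());
        return MethodHandles.dropArguments(name, 0, Object.class);
    }

    // Reached on a miss (assumes a non-null receiver).
    Object fallback(Object receiver) throws Throwable {
        Class<?> cls = receiver.getClass();
        MethodHandle target = resolve(cls);
        if (depth < maxChain) {
            // Prepend a guard; a later miss falls through to the older chain.
            setTarget(MethodHandles.guardWithTest(IS_CLASS.bindTo(cls), target, getTarget()));
            depth++;
            return target.invokeExact(receiver);
        }
        // Limit reached: drop the chain and use the one-entry cache from now on.
        setTarget(CACHED.bindTo(this));
        return cachedCall(receiver);
    }

    Object cachedCall(Object receiver) throws Throwable {
        Class<?> cls = receiver.getClass();
        if (cls != cachedClass) {
            cachedClass = cls;
            cachedTarget = resolve(cls);
        }
        return cachedTarget.invokeExact(receiver);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle invoker = new ChainedSite(2).dynamicInvoker();
        for (Object r : new Object[] {"a", 1, 2.0, "b"}) {
            System.out.println(invoker.invoke(r));
        }
    }
}
```

Constructing the site with a limit of 0 makes the first call fail over immediately, which corresponds to the "drop all the way to zero" configuration described above.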
Re: Interpreting Mission Control numbers for indy
A bit more on performance numbers for this application.

With no indy and monomorphic caches, the full application (a data load) runs in about a minute. I fully recognize that this is a short run, but JMC seems to indicate the bulk of code has compiled well before the halfway point.

With 7u40 or 8, no tiered compilation, it takes about two minutes.

Tiered reduces non-indy time to 51s and indy time to 1m29s.

Tiered + indy + only using monomorphic cache (no direct binding) runs in 1m, still 9s slower than non-indy.

With normal settings, indy call sites do settle down and are mostly monomorphic. For the two phases of the data load, I stop seeing JRuby bind indy call sites a couple seconds in.

There does not appear to be any difference in performance on this app between 7u40 and 8b103.

Like I say, I think the user would be willing to share the application, and I feel like the numbers warrant investigation.

Standing by! :-)

- Charlie
Re: Reproducible InternalError in lambda stuff
On Sep 16, 2013, at 2:59 AM, Charles Oliver Nutter head...@headius.com wrote:

On Mon, Sep 16, 2013 at 2:36 AM, John Rose john.r.r...@oracle.com wrote:

I have refreshed mlvm-dev and pushed some patches to it which may address this problem.

I'll get a build put together and see if I can get users to test it.

If you have time, please give them a try. Do hg qgoto meth-lfc.patch. If this stuff helps we would like to work towards a fix in 7u. What is your time frame for JRuby 1.7.5?

It is on hold indefinitely while we work out user-reported issues (most are not 7u40-related, but we'd like to have an answer for those before release too). I've attached one user's hs_err dump. This was with a 4GB heap. Code cache full

You mean perm gen, right?

and failing spectacularly?

- Charlie

hs_err_pid1184.log
Re: RFR (L) 8024761: JSR 292 improve performance of generic invocation
src/share/classes/java/lang/invoke/CallSite.java:

+if (3 + argv.length > MethodType.MAX_MH_ARITY)
+MethodType invocationType = MethodType.genericMethodType(3 + argv.length);
+MethodHandle spreader = invocationType.invokers().spreadInvoker(3);

Could we use a defined constant for 3?

src/share/classes/java/lang/invoke/Invokers.java:

+if (targetType == targetType.erase() && targetType.parameterCount() < 10)

The same here for 10. Actually, exactInvoker and generalInvoker's code could be factored into one method.

+/*non-public*/ MethodHandle basicInvoker() {
+//invoker.form.compileToBytecode();

Please remove commented lines.

+static MemberName exactInvokeLinkerMethod(MethodType mtype, Object[] appendixResult) {
+static MemberName genericInvokeLinkerMethod(MethodType mtype, Object[] appendixResult) {

These two could also be factored into one method.

+// Return an adapter for invokeExact or generic invoke, as a MH or constant pool linker
+// mtype : the caller's method type (either basic or full-custom)
+// customized : whether to use a trailing appendix argument (to carry the mtype)
+// which&0x01 : whether it is a CP adapter (linker) or MHs.invoker value (invoker)
+// which&0x02 : whether it is for invokeExact or generic invoke
+//
+// If !customized, caller is responsible for supplying, during adapter execution,
+// a copy of the exact mtype. This is because the adapter might be generalized to
+// a basic type.
+private static LambdaForm invokeHandleForm(MethodType mtype, boolean customized, int which) {

Why are you not using Javadoc style for this method comment? It's still helpful in IDEs.

src/share/classes/java/lang/invoke/LambdaForm.java:

static void traceInterpreter(String event, Object obj, Object... args) {
+if (!(TRACE_INTERPRETER && INIT_DONE)) return;

Why not use the same pattern:

+if (TRACE_INTERPRETER && INIT_DONE)

as the other places?

+static final boolean INIT_DONE = Boolean.TRUE.booleanValue();

Why are we having this after all?
src/share/classes/java/lang/invoke/MemberName.java:

+public MemberName asNormalOriginal() {

Could you add a comment to this method? It's not clear to me what normal and original mean here.

+public MemberName(byte refKind, Class<?> defClass, String name, Object type) {
+@SuppressWarnings("LocalVariableHidesMemberVariable")
+int kindFlags;

The SuppressWarnings is in the other constructors because of the name flags. You don't need it here. Maybe rename the others as well and get rid of the annotation.

src/share/classes/java/lang/invoke/MethodHandleNatives.java:

static String refKindName(byte refKind) {
assert(refKindIsValid(refKind));
-return REFERENCE_KIND_NAME[refKind];
+switch (refKind) {

After this change REFERENCE_KIND_NAME is not used anymore.

src/share/classes/java/lang/invoke/MethodHandles.java:

+member.getName().getClass(); member.getType().getClass(); // NPE

Please don't! It would be nice to have at least a different line number in the backtrace.

src/share/classes/java/lang/invoke/MethodTypeForm.java:

+//Lookup.findVirtual(MethodHandle.class, name, type);

Either remove it or add a comment why it's there.

On Sep 12, 2013, at 6:36 PM, John Rose john.r.r...@oracle.com wrote:

Please review this change to the JSR 292 implementation:
http://cr.openjdk.java.net/~jrose/8024761/webrev.00/

Bug description: The performance of MethodHandle.invoke is very slow when the call site type differs from the method handle type. When the types differ, the invocation is defined to proceed as if two steps were taken:

1. the target method handle is first adjusted to the call site type, by MethodHandle.asType
2. the type-adjusted method handle is invoked directly, by MethodHandle.invokeExact

The existing code (from JDK 7) awkwardly performs the type adjustment on every call. But performing the type analysis and adapter creation on every call is inherently slow.

A good fix is to cache the result of step 1 (MethodHandle.asType), since step 2 is already reasonably fast.
For most applications, a one-element cache on each individual method handle is a reasonable choice. It has the particular advantage of speeding up invocations of non-varargs bootstrap methods. To benefit from this, the bootstrap methods themselves need to be uniquified across multiple class files, so this work will also include a cache to benefit commonly-used bootstrap methods, such as JDK 8's LambdaMetafactory.metafactory.

Additional caches could be based on the call site, the call site type, the target type, or the target's MH.form.

Thanks,
- John

P.S. Since this is an implementation change oriented toward performance, the review request is to mlvm-dev and
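The one-element cache described in the bug report can be illustrated at user level. The sketch below is not the patch under review: AsTypeCache and its misses counter are hypothetical names, and the real change lives inside the JDK's own MethodHandle implementation. It only shows the idea of remembering the last asType result so that repeated calls with the same call site type skip step 1.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical wrapper: remembers the most recent asType adapter for one handle.
final class AsTypeCache {
    private final MethodHandle target;
    private MethodHandle cached;  // the single cache entry
    int misses;                   // counts how often asType actually ran

    AsTypeCache(MethodHandle target) {
        this.target = target;
    }

    MethodHandle adapt(MethodType callSiteType) {
        if (callSiteType.equals(target.type())) {
            return target;                       // types already agree
        }
        MethodHandle c = cached;
        if (c != null && c.type().equals(callSiteType)) {
            return c;                            // cache hit: step 1 is skipped
        }
        misses++;
        c = target.asType(callSiteType);         // step 1: build the adapter
        cached = c;
        return c;
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle concat = MethodHandles.lookup().findVirtual(
                String.class, "concat", MethodType.methodType(String.class, String.class));
        AsTypeCache cache = new AsTypeCache(concat);
        MethodType generic = MethodType.genericMethodType(2); // (Object,Object)Object
        for (int i = 0; i < 3; i++) {
            // step 2: exact invocation of the already-adapted handle
            Object r = cache.adapt(generic).invokeExact((Object) "in", (Object) "dy");
            System.out.println(r + " misses=" + cache.misses);
        }
    }
}
```

A single entry is enough when a handle is mostly invoked from call sites of one type; a call with a different type simply replaces the entry.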
Re: RFR (S) 8001108: an attempt to use "<init>" as a method name should elicit NoSuchMethodException
src/share/classes/java/lang/invoke/MethodHandles.java:

+ * methods as if they were normal methods, but the JVM verifier rejects them.

I think you should say JVM byte code verifier here.

+ * <em>(Note: JVM internal methods named {@code "<init>"} not visible to this API,
+ * even though the {@code invokespecial} instruction can refer to them
+ * in special circumstances. Use {@link #findConstructor findConstructor}

Not exactly sure but should this read "are not visible"?

MemberName resolveOrFail(byte refKind, Class<?> refc, String name, MethodType type) throws NoSuchMethodException, IllegalAccessException {
+type.getClass(); // NPE
checkSymbolicClass(refc); // do this before attempting to resolve
-name.getClass(); type.getClass(); // NPE
+checkMethodName(refKind, name);

Could you add a comment here saying that checkMethodName does an implicit null pointer check on name? Maybe a comment for checkMethodName too. What was the problem with the null pointer exceptions? Is it okay that we might throw another exception before null checking name?

On Sep 12, 2013, at 6:47 PM, John Rose john.r.r...@oracle.com wrote:

Please review this change to the JSR 292 implementation:
http://cr.openjdk.java.net/~jrose/8001108/webrev.00

Summary: add an explicit check for leading "<", upgrade the unit tests

The change is mostly to javadoc and unit tests, documenting and testing some corner cases of JSR 292 APIs. In the RI, there is an explicit error check.

Thanks,
- John

P.S. Since this is a change which is oriented toward JSR 292 functionality, the review request is to mlvm-dev and core-libs-dev. Changes which are oriented toward performance will go to mlvm-dev and hotspot-compiler-dev.
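As a rough illustration of the explicit check discussed in this review, here is a sketch in which a name starting with "<" is rejected with NoSuchMethodException, and the startsWith call doubles as the implicit null check on name that the reviewer asks to have documented. The class InitNameCheck, this single-argument checkMethodName, and the exception message are made up for the example; the method in the webrev also takes a refKind argument, which is omitted here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class InitNameCheck {
    // Hypothetical stand-in for the check described in the summary.
    static void checkMethodName(String name) throws NoSuchMethodException {
        // startsWith throws NullPointerException when name is null,
        // so no separate null check is needed.
        if (name.startsWith("<")) {
            throw new NoSuchMethodException("illegal method name: " + name);
        }
    }

    public static void main(String[] args) throws Throwable {
        try {
            checkMethodName("<init>");
        } catch (NoSuchMethodException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // Constructors are reached through findConstructor instead.
        MethodHandle ctor = MethodHandles.lookup().findConstructor(
                StringBuilder.class, MethodType.methodType(void.class, String.class));
        StringBuilder sb = (StringBuilder) ctor.invokeExact("ok");
        System.out.println(sb);
    }
}
```

Note the ordering question raised in the review: with this shape, a null name surfaces as NullPointerException only when the name check runs, so any check placed before it can throw first.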
Re: Interpreting Mission Control numbers for indy
On Sep 18, 2013, at 1:39 AM, Charles Oliver Nutter head...@headius.com wrote:

[...] The JRuby community has a number of very large applications that push the limits of indy. We should work together to improve it.

We talked about this in the past. Can we somehow get access to one of these large applications?