Interpreting Mission Control numbers for indy

2013-09-18 Thread Charles Oliver Nutter
I've been playing with JMC a bit tonight, running a user's application
that's about 2x slower using indy than using trivial monomorphic
caches (and no indy call sites). I'm trying to understand how to
interpret what I see.

In the Code/Overview results, where it lists hot packages, the #1
and #2 packages are java.lang.invoke.LambdaForm$MH and DMH, accounting
for over 37% of samples. That sounds high, but I'm willing to grant
they're hit pretty hard for a fully dynamic application.

Results in the Hot Methods tab show similar things, like
LambdaForm...invokeStatic_LL_L as the number one result and LambdaForm
entries dominating the top 50 entries in the profile. Again, I know
I'm hitting dynamic call sites hard and sampling is not always
accurate.

If I look at compilation events, I only see a handful of
LambdaForm...convert being compiled. I'm not sure if that's good or
bad. My assumption is that LFs don't show up here because they're
always being inlined into a caller.

The performance numbers for the app have me worried too. If I run
JRuby with stock settings, we will chain up to 6 call targets at a
call site. The lower I drop this number, the better performance gets;
when I drop all the way to zero, forcing all invokedynamic call sites
to fail over immediately to a monomorphic inline cache, performance
*almost* gets back to the non-indy implementation. This leads me to
believe that the less I use invokedynamic (or the fewer LFs involved),
the better. That doesn't bode well.
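
For readers unfamiliar with the chaining described above, here is a minimal sketch of a call site that chains guarded targets up to a limit. It is not JRuby's code: the class name, the `maxChain` knob, and the per-class "method lookup" (a constant handle returning the receiver's simple class name) are assumptions made for illustration. Once the limit is reached, this sketch simply stays on the slow path, whereas JRuby fails over to a monomorphic inline cache.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class ChainedCallSite extends MutableCallSite {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    private static final MethodType SITE_TYPE =
            MethodType.methodType(String.class, Object.class);

    private final int maxChain;        // hypothetical knob: max guarded targets per site
    private final MethodHandle miss;   // slow path, always the tail of the chain
    private final MethodHandle invoker;
    private int depth = 0;             // not thread-safe; fine for a sketch

    public ChainedCallSite(int maxChain) {
        super(SITE_TYPE);
        this.maxChain = maxChain;
        try {
            this.miss = LOOKUP.findVirtual(ChainedCallSite.class, "miss", SITE_TYPE)
                              .bindTo(this);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        setTarget(miss);
        this.invoker = dynamicInvoker();
    }

    // Slow path: resolve a target for the receiver's class and, while under the
    // limit, put a new class guard in front of the existing chain.
    String miss(Object receiver) throws Throwable {
        Class<?> cls = receiver.getClass();
        // Stand-in for real method lookup: a handle that returns the class name.
        MethodHandle target = MethodHandles.dropArguments(
                MethodHandles.constant(String.class, cls.getSimpleName()), 0, Object.class);
        if (depth < maxChain) {
            MethodHandle test = LOOKUP.findStatic(ChainedCallSite.class, "isExactly",
                    MethodType.methodType(boolean.class, Class.class, Object.class))
                    .bindTo(cls);
            setTarget(MethodHandles.guardWithTest(test, target, getTarget()));
            depth++;
        }
        return (String) target.invokeExact(receiver);
    }

    static boolean isExactly(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    // Convenience wrapper so callers need not handle Throwable.
    public String call(Object receiver) {
        try {
            return (String) invoker.invokeExact(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public int depth() { return depth; }
}
```

With `new ChainedCallSite(6)` a site can accumulate up to six `guardWithTest` links, each adding LambdaForm frames between caller and target; with `new ChainedCallSite(0)` no target is ever bound into the site.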

I believe the user would be happy to allow me to make these JMC
recordings available, and I'm happy to re-run with additional events
or gather other information. The JRuby community has a number of very
large applications that push the limits of indy. We should work
together to improve it.

- Charlie
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: Interpreting Mission Control numbers for indy

2013-09-18 Thread Charles Oliver Nutter
A bit more on performance numbers for this application.

With no indy, monomorphic caches...the full application (a data load)
runs in about a minute. I fully recognize that this is a short run,
but JMC seems to indicate the bulk of code has compiled well before
the halfway point.

With indy on 7u40 or 8 and no tiered compilation, it takes about two minutes.

Tiered compilation reduces non-indy time to 51s and indy time to 1m29s.

Tiered + indy + only using monomorphic cache (no direct binding) runs
in 1m, still 9s slower than non-indy.

With normal settings, indy call sites do settle down and are mostly
monomorphic. For the two phases of the data load, I stop seeing JRuby
bind indy call sites a couple seconds in.

There does not appear to be any difference in performance on this app
between 7u40 and 8b103.

Like I say...I think the user would be willing to share the
application, and I feel like the numbers warrant investigation.
Standing by! :-)

- Charlie



Re: Reproducible InternalError in lambda stuff

2013-09-18 Thread Christian Thalinger

On Sep 16, 2013, at 2:59 AM, Charles Oliver Nutter head...@headius.com wrote:

 On Mon, Sep 16, 2013 at 2:36 AM, John Rose john.r.r...@oracle.com wrote:
 I have refreshed mlvm-dev and pushed some patches to it which may address
 this problem.
 
 I'll get a build put together and see if I can get users to test it.
 
 If you have time, please give them a try.  Do hg qgoto meth-lfc.patch.
 
 If this stuff helps we would like to work towards a fix in 7u.
 
 What is your time frame for JRuby 1.7.5?
 
 It is on hold indefinitely while we work out user-reported issues
 (most are not 7u40-related, but we'd like to have an answer for those
 before release too).
 
 I've attached one user's hs_err dump. This was with a 4GB heap. Code
 cache full

You mean perm gen, right?

 and failing spectacularly?
 
 - Charlie
 hs_err_pid1184.log



Re: RFR (L) 8024761: JSR 292 improve performance of generic invocation

2013-09-18 Thread Christian Thalinger
src/share/classes/java/lang/invoke/CallSite.java:

+if (3 + argv.length > MethodType.MAX_MH_ARITY)
+MethodType invocationType = MethodType.genericMethodType(3 + argv.length);
+MethodHandle spreader = invocationType.invokers().spreadInvoker(3);

Could we use a defined constant for 3?
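
As an illustration of what the literal 3 stands for, the sketch below uses the public `MethodHandles.spreadInvoker` API (not the internal `invokers().spreadInvoker` from the patch) with a named constant for the number of leading, non-spread arguments. The constant name `NON_SPREAD_ARG_COUNT`, the class, and the stand-in `bsm` method are hypothetical and chosen only for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class SpreadInvokerDemo {
    // Hypothetical name for the literal 3: the leading (caller, name, type)
    // arguments that are passed individually rather than spread from the array.
    static final int NON_SPREAD_ARG_COUNT = 3;

    // Stand-in for a bootstrap method: 3 leading arguments plus 2 extra ones.
    static Object bsm(Object caller, Object name, Object type, Object a, Object b) {
        return name + ":" + a + "," + b;
    }

    public static Object invokeWithSpread(Object caller, Object name, Object type,
                                          Object[] argv) {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    SpreadInvokerDemo.class, "bsm", MethodType.genericMethodType(5));
            MethodType invocationType =
                    MethodType.genericMethodType(NON_SPREAD_ARG_COUNT + argv.length);
            MethodHandle typed = target.asType(invocationType);
            // Type: (MethodHandle, Object, Object, Object, Object[]) -> Object
            MethodHandle spreader =
                    MethodHandles.spreadInvoker(invocationType, NON_SPREAD_ARG_COUNT);
            return (Object) spreader.invokeExact(typed, caller, name, type, argv);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```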

src/share/classes/java/lang/invoke/Invokers.java:

+if (targetType == targetType.erase() && targetType.parameterCount() < 10)

The same here for 10.

Actually, exactInvoker and generalInvoker's code could be factored into one 
method.

+/*non-public*/ MethodHandle basicInvoker() {
+//invoker.form.compileToBytecode();

Please remove commented lines.

+static MemberName exactInvokeLinkerMethod(MethodType mtype, Object[] appendixResult) {
+static MemberName genericInvokeLinkerMethod(MethodType mtype, Object[] appendixResult) {

These two could also be factored into one method.

+// Return an adapter for invokeExact or generic invoke, as a MH or constant pool linker
+// mtype : the caller's method type (either basic or full-custom)
+// customized : whether to use a trailing appendix argument (to carry the mtype)
+// which&0x01 : whether it is a CP adapter ("linker") or MHs.invoker value ("invoker")
+// which&0x02 : whether it is for invokeExact or generic invoke
+//
+// If !customized, caller is responsible for supplying, during adapter execution,
+// a copy of the exact mtype.  This is because the adapter might be generalized to
+// a basic type.
+private static LambdaForm invokeHandleForm(MethodType mtype, boolean customized, int which) {

Why are you not using Javadoc style for this method comment?  It's still 
helpful in IDEs.

src/share/classes/java/lang/invoke/LambdaForm.java:

 static void traceInterpreter(String event, Object obj, Object... args) {
+if (!(TRACE_INTERPRETER && INIT_DONE))  return;

Why not use the same pattern:

+if (TRACE_INTERPRETER && INIT_DONE)

as the other places?

+static final boolean INIT_DONE = Boolean.TRUE.booleanValue();

Why do we have this at all?

src/share/classes/java/lang/invoke/MemberName.java:

+public MemberName asNormalOriginal() {

Could you add a comment to this method?  It's not clear to me what "normal" and 
"original" mean here.

+public MemberName(byte refKind, Class<?> defClass, String name, Object type) {
+@SuppressWarnings("LocalVariableHidesMemberVariable")
+int kindFlags;

The SuppressWarnings is in the other constructors because of the name "flags".  
You don't need it here.  Maybe rename the others as well and get rid of the 
annotation.

src/share/classes/java/lang/invoke/MethodHandleNatives.java:

 static String refKindName(byte refKind) {
 assert(refKindIsValid(refKind));
-return REFERENCE_KIND_NAME[refKind];
+switch (refKind) {

After this change REFERENCE_KIND_NAME is not used anymore.

src/share/classes/java/lang/invoke/MethodHandles.java:

+member.getName().getClass(); member.getType().getClass();  // NPE

Please don't!  It would be nice to have at least a different line number in the 
backtrace.
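
One way to get a distinct line number per check, sketched here with `Objects.requireNonNull` (available since Java 7) instead of the `.getClass()` idiom. The class and method names are hypothetical, and this is not a claim about what the patch ended up doing.

```java
import java.util.Objects;

public class NullCheckDemo {
    public static String describe(String name, Object type) {
        // One check per line: the stack trace line number (and message)
        // identifies which argument was null.
        Objects.requireNonNull(name, "name");
        Objects.requireNonNull(type, "type");
        return name + ":" + type;
    }
}
```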

src/share/classes/java/lang/invoke/MethodTypeForm.java:

+//Lookup.findVirtual(MethodHandle.class, name, type);

Either remove it or add a comment why it's there.

On Sep 12, 2013, at 6:36 PM, John Rose john.r.r...@oracle.com wrote:

 Please review this change for a change to the JSR 292 implementation:
 
 http://cr.openjdk.java.net/~jrose/8024761/webrev.00/
 
 Bug description:  The performance of MethodHandle.invoke is very slow when 
 the call site type differs from the method handle type. When the types 
 differ, the invocation is defined to proceed as if two steps were taken: 
 
 1. the target method handle is first adjusted to the call site type, by 
 MethodHandle.asType 
 
 2. the type-adjusted method handle is invoked directly, by 
 MethodHandle.invokeExact 
 
 The existing code (from JDK 7) awkwardly performs the type adjustment on 
 every call. But performing the type analysis and adapter creation on every 
 call is inherently slow. A good fix is to cache the result of step 1 
 (MethodHandle.asType), since step 2 is already reasonably fast. 
 
 For most applications, a one-element cache on each individual method handle 
 is a reasonable choice. It has the particular advantage of speeding up 
 invocations of non-varargs bootstrap methods. To benefit from this, the 
 bootstrap methods themselves need to be uniquified across multiple class 
 files, so this work will also include a cache to benefit commonly-used 
 bootstrap methods, such as JDK 8's LambdaMetafactory.metafactory. 
 
 Additional caches could be based on the call site, the call site type, the 
 target type, or the target's MH.form. 
 
 Thanks,
 — John
 
 P.S. Since this is an implementation change oriented toward performance, the 
 review request is to mlvm-dev and 
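
The caching idea in the bug description above can be sketched in user code as follows. This is only an illustration of a one-element cache around `asType`, not the code in the webrev; the class name, the `adaptations` counter, and the `twice` target are assumptions made for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AsTypeCacheDemo {
    private final MethodHandle target;
    private MethodHandle cached;      // one-element cache of the last adaptation
    private int adaptations = 0;      // counts how often asType actually ran

    public AsTypeCacheDemo(MethodHandle target) { this.target = target; }

    // Step 1 of generic invocation, with the result remembered per call site type.
    public MethodHandle asTypeCached(MethodType callSiteType) {
        MethodHandle c = cached;
        if (c != null && c.type().equals(callSiteType)) return c;
        c = target.asType(callSiteType);
        adaptations++;
        cached = c;
        return c;
    }

    public int adaptations() { return adaptations; }

    static int twice(int x) { return 2 * x; }

    public static AsTypeCacheDemo forTwice() {
        try {
            return new AsTypeCacheDemo(MethodHandles.lookup().findStatic(
                    AsTypeCacheDemo.class, "twice",
                    MethodType.methodType(int.class, int.class)));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A call whose site type (Object)Object differs from the target's (int)int.
    public Object call(Object arg) {
        try {
            MethodHandle adapted = asTypeCached(MethodType.genericMethodType(1));
            return (Object) adapted.invokeExact(arg);   // step 2: exact invocation
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

Repeated calls through `call` reuse the adapted handle, so the type analysis runs once per call site type rather than once per invocation.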

Re: RFR (S) 8001108: an attempt to use "<init>" as a method name should elicit NoSuchMethodException

2013-09-18 Thread Christian Thalinger
src/share/classes/java/lang/invoke/MethodHandles.java:

+ * methods as if they were normal methods, but the JVM verifier rejects them.

I think you should say "JVM byte code verifier" here.

+ * <em>(Note:  JVM internal methods named {@code "<init>"} not visible to this API,
+ * even though the {@code invokespecial} instruction can refer to them
+ * in special circumstances.  Use {@link #findConstructor findConstructor}

Not exactly sure but should this read "are not visible"?

 MemberName resolveOrFail(byte refKind, Class<?> refc, String name, MethodType type) throws NoSuchMethodException, IllegalAccessException {
+type.getClass();  // NPE
 checkSymbolicClass(refc);  // do this before attempting to resolve
-name.getClass(); type.getClass();  // NPE
+checkMethodName(refKind, name);

Could you add a comment here saying that checkMethodName does an implicit null 
pointer check on name?  Maybe a comment for checkMethodName too.

What was the problem with the null pointer exceptions?  Is it okay that we 
might throw another exception before null checking name?
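
For context, the behavior under review can be observed through the public API. The sketch below assumes a JDK that includes this check; the class and method names are hypothetical. It looks up "<init>" as if it were an ordinary method, then uses `findConstructor`, which is the supported route to constructors.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class InitLookupDemo {
    // Try to look up "<init>" as an ordinary virtual method and report the outcome.
    public static String lookupInitAsMethod() {
        try {
            MethodHandles.lookup().findVirtual(
                    StringBuilder.class, "<init>", MethodType.methodType(void.class));
            return "found";
        } catch (NoSuchMethodException e) {
            return "NoSuchMethodException";
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        }
    }

    // The supported route: findConstructor yields a handle that creates the object.
    public static String viaFindConstructor() {
        try {
            MethodHandle ctor = MethodHandles.lookup().findConstructor(
                    StringBuilder.class, MethodType.methodType(void.class, String.class));
            StringBuilder sb = (StringBuilder) ctor.invokeExact("ok");
            return sb.toString();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```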

On Sep 12, 2013, at 6:47 PM, John Rose john.r.r...@oracle.com wrote:

 Please review this change for a change to the JSR 292 implementation:
 
 http://cr.openjdk.java.net/~jrose/8001108/webrev.00
 
 Summary: add an explicit check for leading "<", upgrade the unit tests
 
 The change is mostly to javadoc and unit tests, documenting and testing some 
 corner cases of JSR 292 APIs.
 
 In the RI, there is an explicit error check.
 
 Thanks,
 — John
 
 P.S. Since this is a change which is oriented toward JSR 292 functionality, the 
 review request is to mlvm-dev and core-libs-dev.
 Changes which are oriented toward performance will go to mlvm-dev and 
 hotspot-compiler-dev.


Re: Interpreting Mission Control numbers for indy

2013-09-18 Thread Christian Thalinger

On Sep 18, 2013, at 1:39 AM, Charles Oliver Nutter head...@headius.com wrote:

 I've been playing with JMC a bit tonight, running a user's application
 that's about 2x slower using indy than using trivial monomorphic
 caches (and no indy call sites). I'm trying to understand how to
 interpret what I see.
 
 In the Code/Overview results, where it lists hot packages, the #1
 and #2 packages are java.lang.invoke.LambdaForm$MH and DMH, accounting
 for over 37% of samples. That sounds high, but I'm willing to grant
 they're hit pretty hard for a fully dynamic application.
 
 Results in the Hot Methods tab show similar things, like
 LambdaForm...invokeStatic_LL_L as the number one result and LambdaForm
 entries dominating the top 50 entries in the profile. Again, I know
 I'm hitting dynamic call sites hard and sampling is not always
 accurate.
 
 If I look at compilation events, I only see a handful of
 LambdaForm...convert being compiled. I'm not sure if that's good or
 bad. My assumption is that LFs don't show up here because they're
 always being inlined into a caller.
 
 The performance numbers for the app have me worried too. If I run
 JRuby with stock settings, we will chain up to 6 call targets at a
 call site. The lower I drop this number, the better performance gets;
 when I drop all the way to zero, forcing all invokedynamic call sites
 to fail over immediately to a monomorphic inline cache, performance
 *almost* gets back to the non-indy implementation. This leads me to
 believe that the less I use invokedynamic (or the fewer LFs involved),
 the better. That doesn't bode well.
 
 I believe the user would be happy to allow me to make these JMC
 recordings available, and I'm happy to re-run with additional events
 or gather other information. The JRuby community has a number of very
 large applications that push the limits of indy. We should work
 together to improve it.

We talked about this in the past.  Can we somehow get access to one of these 
large applications?

 
 - Charlie