I am new to OpenJDK and mlvm, but I have decades of experience in benchmark
cheating and hackery. (No, that is not what I am bringing to OpenJDK.)
I went to look at Caliper, and in the tutorial, I saw this:
    return dummy; // framework ignores this, but it has served its purpose!
This is not at all confidence-inspiring; inlining of the call, or even
conditional inlining, allows the deadness of dummy to be detected. Once that
happens the work that computed it can be deleted too, and then your benchmark
is screwed.
The framework should print dummy to a real live file, not /dev/null (yes, once
upon a time a workstation vendor spotted the /dev/null case and short-circuited
all of libcurses out of the way). I wouldn't trust reflective access to be
sufficiently obfuscating; if I used reflection and it could be optimized into a
direct call, I would like that, so I should not be surprised if an optimizer
picked up that transformation and accidentally messed up this benchmark.
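
A rough sketch of the point above, assuming a hypothetical hand-rolled harness
(SinkDemo, timeReps, and reportLine are made-up names, not Caliper API): the
value the benchmark computes is written to a real file, so it stays observably
live. Whether a given JIT would actually eliminate an unused result depends on
the VM and its inlining decisions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SinkDemo {
    // Hypothetical benchmark body: the returned value depends on every iteration.
    static long timeReps(int reps) {
        long dummy = 0;
        for (int i = 0; i < reps; i++) {
            dummy += Integer.toString(i).hashCode();
        }
        return dummy;
    }

    // Format the computed value together with the timing, so both end up in the output.
    static String reportLine(long dummy, long elapsedNs) {
        return "dummy=" + dummy + " elapsedNs=" + elapsedNs;
    }

    public static void main(String[] args) throws IOException {
        int reps = 1_000_000;
        long start = System.nanoTime();
        long dummy = timeReps(reps);
        long elapsedNs = System.nanoTime() - start;

        // Write to a real file (not /dev/null) so the result is genuinely consumed.
        Path out = Files.createTempFile("bench-sink", ".txt");
        Files.write(out, (reportLine(dummy, elapsedNs) + "\n").getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + out);
    }
}
```

The timing here is a single cold run and is not a usable measurement; the only
thing the sketch illustrates is where the computed value ends up.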
Sorry if I seem skeptical, but I've seen intelligent people write terrible
benchmarks.
Consider the old JavaGrande Fork-Join benchmark:
    // do something trivial but which won't be optimised away!
    double theta=37.2, sint, res;
    sint = Math.sin(theta);
    res = sint*sint;
    //defeat dead code elimination
    if(res <= 0) System.out.println(
        "Benchmark exited with unrealistic res value " + res);
Under strictfp, Math.sin of theta is -0.478645918588415; that comes straight
from the spec if you simply follow the recipe. Even with widefp, the optimizer
knows the target FP and can constant-propagate there, too. I think you can
figure out the rest.
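
To make the folding argument concrete, here is a small sketch (FoldDemo and its
method names are hypothetical, not taken from JavaGrande). With theta fixed at
37.2, res is the same value on every call, roughly 0.229, so the `res <= 0`
guard can never fire and nothing forces the work to be done at run time. The
second method takes theta as a parameter; when the caller supplies a value that
is only known at run time, there is no constant to propagate. Whether a
particular JIT actually performs the folding is an assumption, not something
this code proves.

```java
final class FoldDemo {
    // Same shape as the quoted snippet: every input is a compile-time constant.
    static double constantInput() {
        double theta = 37.2;
        double sint = Math.sin(theta);
        return sint * sint;
    }

    // Variation: theta is supplied by the caller, so this method has no constant to fold.
    static double runtimeInput(double theta) {
        double sint = Math.sin(theta);
        return sint * sint;
    }

    public static void main(String[] args) {
        // Take theta from the command line so its value is unknown until run time.
        double theta = args.length > 0 ? Double.parseDouble(args[0]) : 37.2;

        double res = constantInput();
        System.out.println("constant res = " + res + ", guard fires: " + (res <= 0));

        // Accumulate over varying inputs and print the sum so the results are used.
        double acc = 0;
        for (int i = 0; i < 1000; i++) {
            acc += runtimeInput(theta + i);
        }
        System.out.println("accumulated = " + acc);
    }
}
```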
Sorry if I seem skeptical, but you've got to be very careful with
microbenchmarks. They say they're careful, but I looked at their examples, and
by my standards, they're not careful enough.
David
On 2012-10-15, at 9:12 PM, Ashwin Jayaprakash <[email protected]> wrote:
> People seem to be skeptical about the micro benchmarks I posted. This is why
> I used Caliper (http://code.google.com/p/caliper/).
>
> Caliper runs each test multiple times (indicated by "trials"). Each trial
> itself runs the code in a loop of "reps", which iterates tens of millions
> of times. So, I'm uploading the log files for your verification. Look for
> statements like "running trial with 138668464 reps" in the files. Caliper
> does run the test enough times to let the JIT warm up.
>
> JDK 8:
> 0% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 19.88 ns; σ=0.90 ns @ 10 trials
> 10% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 16.06 ns; σ=0.09 ns @ 3 trials
> 20% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 7.41 ns; σ=0.01 ns @ 3 trials
> 30% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M,
> tier=-XX:+TieredCompilation} 7.27 ns; σ=0.06 ns @ 3 trials
> 40% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 7.33 ns; σ=0.01 ns @ 3 trials
> 50% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 18.55 ns; σ=1.88 ns @ 10 trials
> 60% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 15.13 ns; σ=0.06 ns @ 3 trials
> 70% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 7.21 ns; σ=0.07 ns @ 4 trials
> 80% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M,
> tier=-XX:-TieredCompilation} 7.23 ns; σ=0.07 ns @ 9 trials
> 90% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 7.20 ns; σ=0.02 ns @ 3 trials
>
> benchmark tier                      ns linear runtime
> Reflect   -XX:+TieredCompilation 19.88 ==============================
> Reflect   -XX:-TieredCompilation 18.55 ===========================
> Handle    -XX:+TieredCompilation 16.06 ========================
> Handle    -XX:-TieredCompilation 15.13 ======================
> Direct    -XX:+TieredCompilation  7.41 ===========
> Direct    -XX:-TieredCompilation  7.21 ==========
> Iface     -XX:+TieredCompilation  7.27 ==========
> Iface     -XX:-TieredCompilation  7.23 ==========
> Static    -XX:+TieredCompilation  7.33 ===========
> Static    -XX:-TieredCompilation  7.20 ==========
>
> vm: java
> trial: 0
> tune: -server -Xmx96M -Xmx96M
>
> Writing results to C:\temp\jdk_8_ea_b59.log
>
>
> JDK 7:
> 0% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 16.40 ns; σ=0.16 ns @ 7 trials
> 10% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 20.89 ns; σ=0.64 ns @ 10 trials
> 20% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 4.81 ns; σ=0.04 ns @ 3 trials
> 30% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M,
> tier=-XX:+TieredCompilation} 4.86 ns; σ=0.05 ns @ 3 trials
> 40% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:+TieredCompilation} 4.84 ns; σ=0.04 ns @ 3 trials
> 50% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 16.55 ns; σ=0.15 ns @ 4 trials
> 60% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 20.96 ns; σ=0.59 ns @ 10 trials
> 70% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 4.79 ns; σ=0.01 ns @ 3 trials
> 80% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M,
> tier=-XX:-TieredCompilation} 4.80 ns; σ=0.03 ns @ 3 trials
> 90% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M
> -Xmx96M, tier=-XX:-TieredCompilation} 4.85 ns; σ=0.05 ns @ 7 trials
>
> benchmark tier                      ns linear runtime
> Reflect   -XX:+TieredCompilation 16.40 =======================
> Reflect   -XX:-TieredCompilation 16.55 =======================
> Handle    -XX:+TieredCompilation 20.89 =============================
> Handle    -XX:-TieredCompilation 20.96 ==============================
> Direct    -XX:+TieredCompilation  4.81 ======
> Direct    -XX:-TieredCompilation  4.79 ======
> Iface     -XX:+TieredCompilation  4.86 ======
> Iface     -XX:-TieredCompilation  4.80 ======
> Static    -XX:+TieredCompilation  4.84 ======
> Static    -XX:-TieredCompilation  4.85 ======
>
> vm: java
> trial: 0
> tune: -server -Xmx96M -Xmx96M
>
> Writing results to C:\temp\jdk_7u7.log
>
>
> Regards,
> Ashwin.
>
> <jdk_7u7.log><jdk_8_ea_b59.log>
> _______________________________________________
> mlvm-dev mailing list
> [email protected]
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev