Since we're in wishful thinking territory now :), the two things I'd really like are:
1) value/struct types (i.e. avoid the heap and be able to pack data closer
together). I don't know how much we can rely on EA (escape analysis).
2) more auto-vectorization

I think 2 is being worked on by Vladimir, but it's unclear whether there are
any concrete plans for 1. I know John Rose has written about it, but I don't
know if anything's actually planned.

Sent from my phone

On Sep 28, 2012 3:59 PM, "Charles Oliver Nutter" <head...@headius.com> wrote:
> Now what we need is a way to inject new intrinsics into the JVM, so I
> can make an asm version of something and tell hotspot "no no, use
> this, not the JVM bytecode" :)
>
> - Charlie
>
> On Fri, Sep 28, 2012 at 11:53 AM, Vitaly Davidovich <vita...@gmail.com>
> wrote:
> > Yup, it would have to do extensive pattern matching otherwise. C/C++
> > compilers do the same thing (i.e. have intimate knowledge of stdlib calls
> > and may optimize more aggressively or replace code with intrinsic
> > altogether).
> >
> > In this case, jit uses the bsf x86 assembly instruction whereas hand rolled
> > "copy version" generates asm pretty much matching the java code.
> >
> > Sent from my phone
> >
> > On Sep 28, 2012 2:42 PM, "Raffaello Giulietti"
> > <raffaello.giulie...@gmail.com> wrote:
> >>
> >> On Fri, Sep 28, 2012 at 8:15 PM, Charles Oliver Nutter
> >> <head...@headius.com> wrote:
> >> > On Fri, Sep 28, 2012 at 10:21 AM, Raffaello Giulietti
> >> > <raffaello.giulie...@gmail.com> wrote:
> >> >> I'm not sure that we are speaking about the same thing.
> >> >>
> >> >> The Java source code of numberOfTrailingZeros() is exactly the same in
> >> >> Integer as it is in MyInteger. But, as far as I understand, what
> >> >> really runs on the metal upon invocation of the Integer method is not
> >> >> JITted code but something else that probably makes use of CPU specific
> >> >> instructions. This code is built directly into the JVM and need not
> >> >> bear any resemblance with the code that would have been produced by
> >> >> JITting the bytecode.
> >> >
> >> > Regardless of whether the method is implemented in Java or not, the
> >> > JVM "knows" native/intrinsic/optimized versions of many java.lang core
> >> > methods. numberOfTrailingZeros is one such method.
> >> >
> >> > Here, the JVM is using its intrinsified version rather than the JITed
> >> > version, presumably because the intrinsified version is pre-optimized
> >> > and faster than what the JVM JIT can do for the JVM bytecode version.
> >> >
> >> > system ~/projects/jruby-ruby $ java -XX:+PrintCompilation
> >> > -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining Blah
> >> >      65    1       java.lang.String::hashCode (55 bytes)
> >> >      78    2       Blah::doIt (5 bytes)
> >> >      78    3       java.lang.Integer::numberOfTrailingZeros (79 bytes)
> >> >                      @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >      79    1 %     Blah::main @ 2 (29 bytes)
> >> >                      @ 9   Blah::doIt (5 bytes)   inline (hot)
> >> >                        @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >                      @ 15   Blah::doIt (5 bytes)   inline (hot)
> >> >                        @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >
> >> > system ~/projects/jruby-ruby $ cat Blah.java
> >> > public class Blah {
> >> >     public static int value = 0;
> >> >     public static void main(String[] args) {
> >> >         for (int i = 0; i < 10_000_000; i++) {
> >> >             value = doIt(i) + doIt(i * 2);
> >> >         }
> >> >     }
> >> >
> >> >     public static int doIt(int i) {
> >> >         return Integer.numberOfTrailingZeros(i);
> >> >     }
> >> > }
> >>
> >>
> >> Yes, this is what Vitaly stated and what happens behind the curtains.
> >>
> >> In the end, this means there are no chances for the rest of us to
> >> implement better Java code as a replacement for the intrinsified
> >> methods.
> >>
> >> For example, the following variant is about 2.5 times *faster*,
> >> averaged over all integers, than the JITted original method, the one
> >> copied verbatim!
> >> (Besides, everybody would agree that it is more
> >> readable, I hope.)
> >>
> >> But since the Integer version is intrinsified, it still runs about 2
> >> times slower than that (mysterious) code.
> >>
> >> public static int numberOfTrailingZeros(int i) {
> >>     int n = 0;
> >>     for (; n < 32 && (i & 1 << n) == 0; ++n);
> >>     return n;
> >> }
> >> _______________________________________________
> >> mlvm-dev mailing list
> >> mlvm-dev@openjdk.java.net
> >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
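
For anyone who wants to try the loop variant quoted above next to the JDK method, here is a minimal sketch that only checks that the two agree. It says nothing about speed. The class name TrailingZerosCheck, the helper firstMismatch, and the sampling stride are my own choices for illustration, not something from the thread.

```java
final class TrailingZerosCheck {

    // Loop variant from the thread: test bits from the least significant
    // one upward and stop at the first set bit (or after 32 bits for 0).
    public static int loopNumberOfTrailingZeros(int i) {
        int n = 0;
        while (n < 32 && (i & (1 << n)) == 0) {
            n++;
        }
        return n;
    }

    // Compares the loop variant with Integer.numberOfTrailingZeros on a few
    // edge cases and on a strided sample of the whole int range.
    // Returns the first disagreeing input, or null if none was found.
    // The stride is an arbitrary choice; this is a sample, not a proof.
    public static Integer firstMismatch() {
        int[] edgeCases = {0, 1, -1, 2, 8, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : edgeCases) {
            if (loopNumberOfTrailingZeros(i) != Integer.numberOfTrailingZeros(i)) {
                return i;
            }
        }
        // A long counter avoids int overflow at the top of the range.
        for (long v = Integer.MIN_VALUE; v <= Integer.MAX_VALUE; v += 65_537) {
            int i = (int) v;
            if (loopNumberOfTrailingZeros(i) != Integer.numberOfTrailingZeros(i)) {
                return i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Integer bad = firstMismatch();
        System.out.println(bad == null ? "no mismatch found" : "mismatch at " + bad);
    }
}
```

Timing the two is a separate matter: a plain loop around either call is easy for the JIT to distort, so any numbers would need a proper benchmark harness.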