Since we're in wishful thinking territory now :), the two things I'd really
like are:

1) value/struct types (i.e. avoid heap allocation and be able to pack data
closer together).  I don't know how much we can rely on EA (escape analysis).
2) more auto-vectorization

I think 2 is being worked on by Vladimir, but it's unclear whether there are
any concrete plans for 1.  I know John Rose has written about it, but I don't
know if anything's actually planned.
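
To make both wishes concrete, here is a minimal sketch of the usual
workaround today. All names in it (PointsSoA, addInto) are made up for
illustration. It keeps each field in its own primitive array instead of an
array of small heap objects, and it uses the kind of plain counted loop that
an auto-vectorizer has the best chance of handling. Whether a given HotSpot
build actually vectorizes the loop is an assumption, not something this shows.

```java
// Sketch only: PointsSoA and addInto are hypothetical names, not an existing API.
final class PointsSoA {
    // "Struct of arrays": one primitive array per field, so values sit
    // contiguously in memory with no per-element object header or pointer.
    final double[] xs;
    final double[] ys;

    PointsSoA(int n) {
        xs = new double[n];
        ys = new double[n];
    }

    void set(int i, double x, double y) {
        xs[i] = x;
        ys[i] = y;
    }

    // Simple counted loop over primitive arrays with no calls in the body.
    // Assumes a and b are at least as long as dst.
    static void addInto(double[] dst, double[] a, double[] b) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        PointsSoA p = new PointsSoA(4);
        for (int i = 0; i < 4; i++) {
            p.set(i, i, 2 * i);
        }
        double[] sums = new double[4];
        addInto(sums, p.xs, p.ys);
        // prints [0.0, 3.0, 6.0, 9.0]
        System.out.println(java.util.Arrays.toString(sums));
    }
}
```

The cost is that a "point" is no longer a single object you can pass around,
which is exactly what real value types would fix.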

Sent from my phone
On Sep 28, 2012 3:59 PM, "Charles Oliver Nutter" <head...@headius.com>
wrote:

> Now what we need is a way to inject new intrinsics into the JVM, so I
> can make an asm version of something and tell hotspot "no no, use
> this, not the JVM bytecode" :)
>
> - Charlie
>
> On Fri, Sep 28, 2012 at 11:53 AM, Vitaly Davidovich <vita...@gmail.com>
> wrote:
> > Yup, it would have to do extensive pattern matching otherwise.  C/C++
> > compilers do the same thing (i.e. they have intimate knowledge of stdlib
> > calls and may optimize more aggressively or replace the code with an
> > intrinsic altogether).
> >
> > In this case, the JIT uses the bsf x86 assembly instruction, whereas the
> > hand-rolled "copy version" generates asm pretty much matching the Java
> > code.
> >
> > Sent from my phone
> >
> > On Sep 28, 2012 2:42 PM, "Raffaello Giulietti"
> > <raffaello.giulie...@gmail.com> wrote:
> >>
> >> On Fri, Sep 28, 2012 at 8:15 PM, Charles Oliver Nutter
> >> <head...@headius.com> wrote:
> >> > On Fri, Sep 28, 2012 at 10:21 AM, Raffaello Giulietti
> >> > <raffaello.giulie...@gmail.com> wrote:
> >> >> I'm not sure that we are speaking about the same thing.
> >> >>
> >> >> The Java source code of numberOfTrailingZeros() is exactly the same
> >> >> in Integer as it is in MyInteger. But, as far as I understand, what
> >> >> really runs on the metal upon invocation of the Integer method is not
> >> >> JITted code but something else that probably makes use of CPU-specific
> >> >> instructions. This code is built directly into the JVM and need not
> >> >> bear any resemblance to the code that would have been produced by
> >> >> JITting the bytecode.
> >> >
> >> > Regardless of whether the method is implemented in Java or not, the
> >> > JVM "knows" native/intrinsic/optimized versions of many java.lang core
> >> > methods. numberOfTrailingZeros is one such method.
> >> >
> >> > Here, the JVM is using its intrinsified version rather than the JITed
> >> > version, presumably because the intrinsified version is pre-optimized
> >> > and faster than what the JVM JIT can do for the JVM bytecode version.
> >> >
> >> > system ~/projects/jruby-ruby $ java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining Blah
> >> >      65    1             java.lang.String::hashCode (55 bytes)
> >> >      78    2             Blah::doIt (5 bytes)
> >> >      78    3             java.lang.Integer::numberOfTrailingZeros (79 bytes)
> >> >                             @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >      79    1 %           Blah::main @ 2 (29 bytes)
> >> >                             @ 9   Blah::doIt (5 bytes)   inline (hot)
> >> >                               @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >                             @ 15   Blah::doIt (5 bytes)   inline (hot)
> >> >                               @ 1   java.lang.Integer::numberOfTrailingZeros (79 bytes)   (intrinsic)
> >> >
> >> > system ~/projects/jruby-ruby $ cat Blah.java
> >> > public class Blah {
> >> >   public static int value = 0;
> >> >
> >> >   public static void main(String[] args) {
> >> >     for (int i = 0; i < 10_000_000; i++) {
> >> >       value = doIt(i) + doIt(i * 2);
> >> >     }
> >> >   }
> >> >
> >> >   public static int doIt(int i) {
> >> >     return Integer.numberOfTrailingZeros(i);
> >> >   }
> >> > }
> >>
> >>
> >> Yes, this is what Vitaly stated and what happens behind the curtains.
> >>
> >> In the end, this means there is no chance for the rest of us to
> >> implement better Java code as a replacement for the intrinsified
> >> methods.
> >>
> >> For example, the following variant is about 2.5 times *faster*,
> >> averaged over all integers, than the JITted original method, the one
> >> copied verbatim! (Besides, everybody would agree that it is more
> >> readable, I hope.)
> >>
> >> But since the Integer version is intrinsified, my variant still runs
> >> about 2 times slower than that (mysterious) intrinsic code.
> >>
> >>     public static int numberOfTrailingZeros(int i) {
> >>         int n = 0;
> >>         for (; n < 32 && (i & 1 << n) == 0; ++n);
> >>         return n;
> >>     }
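
Before comparing speed, it's worth confirming that the loop variant returns
the same results as the JDK method. A small sketch of such a check follows.
The class name TrailingZerosCheck and its helper methods are made up for
illustration, and it checks correctness only; it says nothing about the
timings quoted above.

```java
// Sketch only: TrailingZerosCheck, loopVersion and agreesWithJdk are hypothetical names.
final class TrailingZerosCheck {
    // The loop variant from the quoted message, copied unchanged.
    static int loopVersion(int i) {
        int n = 0;
        for (; n < 32 && (i & 1 << n) == 0; ++n);
        return n;
    }

    // True if the loop variant and the JDK method agree for this input.
    static boolean agreesWithJdk(int i) {
        return loopVersion(i) == Integer.numberOfTrailingZeros(i);
    }

    public static void main(String[] args) {
        // Edge cases: zero, all bits set, and the extreme int values.
        int[] samples = {0, 1, 2, 8, 12, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            if (!agreesWithJdk(s)) {
                throw new AssertionError("mismatch for " + s);
            }
        }
        // Every single-bit value: the answer must be the bit position.
        for (int bit = 0; bit < 32; bit++) {
            if (loopVersion(1 << bit) != bit) {
                throw new AssertionError("wrong result for bit " + bit);
            }
        }
        System.out.println("loop variant agrees with Integer.numberOfTrailingZeros");
    }
}
```

Note that for an input of 0 the loop runs to n == 32 and stops on the bound
check, matching the 32 that Integer.numberOfTrailingZeros returns for zero.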
> >> _______________________________________________
> >> mlvm-dev mailing list
> >> mlvm-dev@openjdk.java.net
> >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev