Re: [arch] Interpreter vs. JIT for Harmony VM

Frederick C Druseikis Wed, 21 Sep 2005 20:38:58 -0700

Hello list,
Long time reader, first time writer.

On Wed, 21 Sep 2005 14:30:10 -0300
Rodrigo Kumpera <[EMAIL PROTECTED]> wrote:

> Having a mixed JITed-interpreted environment makes things harder.
> Writing a baseline, single pass JITer is easy, but there is A LOT
> more stuff to making a port than just the code execution part.

Agreed.
I'd like to amplify that point about the "more stuff".  My conclusion is
that if you try to live without it, you'll end up creating the guts of
the interpreter anyway.

I've been looking at how a fast JIT might work with a small interpreter,
like JamVM.  The JIT I had in mind was the (now ancient) TYA.  Its code
generator is very small and relatively isolated from all of the junk
that early Java 2 interpreters imposed.

Issues I see:

(1) jni.h -- There is a lot more functionality than GC: reflection,
class loading, resolving field and method names, resolving class names,
even getting the version number ;) It's a long list.  The TYA recoder
(Java bytecode to x86 binary) calls resolve methods in the JNI.

(2) There is nothing to JIT to get started!  This code would be a level
underneath classpath.  So I fall off the wagon wondering how a JIT-only
system even gets off the ground.  It's not *just* a code generator.  A
JVM brings something to the party.

The OpenJIT project made claims about bootstrapping, but it seems to be
only of archival interest now.  (Its JIT was written entirely in Java,
though with some unclear dependency on the ca. Java 1.2 implementations,
and it has an unfriendly license.)

(3) JamVM, and others as well it seems, like to play tricks with the
bytecodes, often rewriting them after the first pass through a method to
avoid repeating expensive lookups.  This is really an acknowledgement
that the class files with their bytecodes lack a lot of essential
binding information, all of which a JIT would have to discover too.

(4) The point is that the best information about what to JIT is what
has been measured -- something like frequency counts after N interpreter
calls.  Who cares about code that is only executed once?
public static void main() is not where to start JITing.  That's why we
need an interpreter.  There is so much bytecode not even worth JITing :)

(5) The better optimizations are the ones that eliminate interpreter
function calls.  TYA did a tiny amount of constant folding and inlining.
Our current design methodologies emphasize getter/setter-style bean
conventions so heavily that a lot of current code is ripe for very
simple source-level optimization.  This is where the low-hanging fruit
is.  And it potentially benefits an interpreter as well as a JITer.  So
much for AOP on arbitrary getters :)
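
To make point (3) concrete, here is a minimal sketch of the
bytecode-rewriting trick, in Python for brevity.  The opcode names, the
resolve_field helper, and the tuple instruction format are all
hypothetical; this is not JamVM's actual code, only the general shape of
the technique.

```python
# Hypothetical opcodes; real JVM opcodes and JamVM internals differ.
GETFIELD = "getfield"              # operand is a symbolic field name
GETFIELD_QUICK = "getfield_quick"  # operand is a resolved slot index
RETURN = "return"

resolve_calls = 0  # counts how often the expensive lookup runs


def resolve_field(layout, name):
    """Stand-in for the expensive symbolic lookup (constant pool, class data)."""
    global resolve_calls
    resolve_calls += 1
    return layout.index(name)


def run(code, layout, obj):
    """Interpret `code` against `obj`, a list of field slots; returns the stack."""
    stack = []
    pc = 0
    while True:
        op, arg = code[pc]
        if op == GETFIELD:
            slot = resolve_field(layout, arg)
            # Rewrite the instruction in place so later passes skip the lookup.
            code[pc] = (GETFIELD_QUICK, slot)
            stack.append(obj[slot])
        elif op == GETFIELD_QUICK:
            stack.append(obj[arg])
        elif op == RETURN:
            return stack
        else:
            raise ValueError("unknown opcode: %r" % (op,))
        pc += 1


layout = ["x", "y"]
method = [(GETFIELD, "y"), (GETFIELD, "x"), (RETURN, None)]
first = run(method, layout, [10, 20])   # resolves both fields, rewrites code
second = run(method, layout, [30, 40])  # takes only the quick path
```

The second call never reaches resolve_field, because the first pass left
the resolved slot indices in the instruction stream.  A JIT working from
the raw class file would have to do the same resolution itself.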

My sense is that a small interpreter with a pluggable JIT is a pragmatic
approach.  To me that makes the central question the interface between
the interpreter and the JIT: what part of the interpreter decides which
methods should be JITed?  I see the interpreter collecting data, some
pluggable interface making policy decisions about what to JIT, and one
of them calling the JITer when it's time.
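
A minimal sketch of that split, in Python for brevity: the interpreter
keeps per-method call counts, a pluggable policy object decides when a
method is hot, and the interpreter then hands the method to the JIT.
Every name here (ThresholdPolicy, Interpreter, fake_jit) is a
hypothetical illustration, not an existing VM interface, and the "JIT"
simply reuses the interpreted body.

```python
class ThresholdPolicy:
    """Hypothetical policy: compile a method once it has been called N times."""

    def __init__(self, threshold):
        self.threshold = threshold

    def should_compile(self, name, call_count):
        return call_count >= self.threshold


class Interpreter:
    def __init__(self, policy, jit):
        self.policy = policy  # pluggable decision maker
        self.jit = jit        # callable: (name, body) -> compiled callable
        self.counts = {}      # per-method invocation counts
        self.compiled = {}    # methods the JIT has already handled

    def invoke(self, name, body, *args):
        if name in self.compiled:
            return self.compiled[name](*args)
        self.counts[name] = self.counts.get(name, 0) + 1
        if self.policy.should_compile(name, self.counts[name]):
            # Compiled code takes over from the next call onward.
            self.compiled[name] = self.jit(name, body)
        return body(*args)  # stands in for interpreting the bytecode


jitted = []  # records which methods were handed to the JIT


def fake_jit(name, body):
    """Stand-in for a real code generator: records the request, reuses body."""
    jitted.append(name)
    return body


vm = Interpreter(ThresholdPolicy(threshold=3), fake_jit)
vm.invoke("main", lambda: None)  # runs once, never compiled
results = [vm.invoke("square", lambda x: x * x, n) for n in range(5)]
```

Here "main" is interpreted once and left alone, while "square" crosses
the threshold on its third call and is dispatched to the compiled
version afterwards.  Swapping in a different policy object changes what
gets JITed without touching the interpreter loop.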

-Fred Druseikis

> 
> JikesRVM has a base JITer class that does the bytecode decoding and
> one subclass per platform that does the code generation. Porting is a
> matter of creating another subclass and implementing a few methods (a
> LOT fewer than the 198 opcodes of the Java bytecode).
> 
> The hard part is not the code of the JITer subclass, but writing an
> assembler for it. Take x86, for example, generating all those
> addressing modes is pure and simple PITA. And there are other code
> artifacts that need to be generated, like call traps and interface
> dispatch functions.
> 
> Then we have some platform specific issues, like exception handling,
> stack walking, scanning and unwinding, native calls, NPE traps and
> many more that I'm missing here.
> 
> The GC code should suffer from some platform issues due to read/write
> barriers, card marking methods, how to park all threads, etc.
> 
> And this is only just to port from one hardware platform to another.
> OS porting is another big source of problems.
> 
> All in all, I think that using a JITed only environment is easier.
> 
> 
> On 9/21/05, Peter Edworthy <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > > Do we need an interpreter or a fast code generating, zero optimizing JIT?
> >
> > I'd vote with a zero optimizing JIT. My reasons are not so much based on
> > speed, but on code reuse. The structures required to support this would
> > also be used by optimizing JITs. In an interpreter and JIT system the two
> > tend not to overlap as nicely. Less code & concepts to understand =
> > better, IMHO.
> >
> > > I can think of advantages to each approach. Assuming a C/C++
> > > implementation, a traditional interpreter is easy to port to a new
> > > architecture and we can populate Execution Engines (EE) for different
> > > platforms rather easily.
> >
> > (This is at the edge of my knowledge, I've read about it but never tried it)
> > If the JIT is a bottom up pattern matching compiler, which seems to fit
> > well with the Java Byte Code format, then populating the 'pattern tables'
> > especially if not aiming for much if any optimization would be just as
> > easy as setting up EEs.
> >
> > > On the other hand, a fast code-generating JIT can call runtime helpers and
> > > native methods without additional glue code whereas an interpreter has to
> > > have special glue code to make it work in a JIT environment. Needless to
> > > say, if a method is called more than once, the one time cost of JITing
> > > without optimization may be lower than the cost of running the interpreter
> > > loop.
> >
> > For Magnus an example to explain the above.
> >
> > a = sin (b) compiles to something like
> >
> > load b
> > push
> > call sin
> > pop
> > mov a
> >
> > If the sin function is native then for an interpreter the process would be
> >
> > read load b; push
> > carry out equivalent operation
> > read call sin
> > Find sin method is native
> > call native call routine
> >   call sin
> >   return sin return
> > read pop; mov a
> > carry out equivalent operation
> >
> > so the call to the sin function is actually two jumps, which is bad as
> > jumps take time and often invalidate the data cache in the processor.
> >
> > If compiled then it would be
> >
> > load b; push; call sin; pop; move to a
> >
> > There is only one call to get to the sin method
> >
> > > Our experience is that a fast, zero optimizing JIT can yield low-enough
> > > response time. So, I think at least Harmony has the option of having a
> > > decent system without an interpreter. Thoughts?
> > Again less is more ;-}>
> >
> > Thanks,
> > Peter Edworthy
> >
> >
