On 6/26/05, Rafal Lewczuk <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> Newbie's thoughts below.
> 
> On 6/25/05, Ahmed Saad <[EMAIL PROTECTED]> wrote:
> > 2. which current VM implementation would we start refining as a core for
> > Harmony? (or we would write it from scratch)
> 
> Newbie's random thought: start with some simple-as-hell implementation
> (JamVM may be a good candidate) and refactor it into a modular one
> (kind of 'stretching' it onto a 'framework' set of interfaces,
> extracting GC, execution engine, class loader etc. one by one).
> 

After my initial posting to this list I've gone very much back into
lurker mode.  Just as I'm not into promoting JamVM, I'm not much into
defending it either (though some people may disagree with this).  You
can take it or leave it.

However, I'm probably one of the few people who has written a
non-trivial VM from scratch, and when I started I already was an
experienced VM engineer.  So my thoughts may be useful/interesting or
annoying.

First of all, just because JamVM is small does not mean it is trivial.
 As I was interested in targeting embedded platforms, I put in a
large amount of design effort _from the start_ to minimise code size
and runtime memory usage.  As in many other situations, smallness can
come from triviality or from careful design.  Of course, many parts of
JamVM are simplistic, but if code size == quality we'd all be using
Microsoft Windows.  It is the last trap I would have thought
open-source people would fall into.  The continued assumption that
JamVM is small == simple has begun to affect my coding style, but this
is unfair to users on embedded platforms.

The interpreter in particular I like to think of as
"state-of-the-art".  It is certainly not trivial, and it is more
optimised than most commercial VM interpreters (in many tests it is 2x
faster than HotSpot's interpreter under Mac OS X).  This in itself has
taken many months of work, and I have substantially rewritten it
twice, so it is two iterations beyond my first interpreter, which also
included several advanced techniques.  It now uses direct threading,
static and dynamic stack caching, prefetching and
super-instructions.
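
To make two of those terms concrete, here is a minimal sketch of
direct threading with one super-instruction.  It is not JamVM's code:
the opcodes, the Cell type and run() are made up for illustration, and
it assumes the GNU C "labels as values" extension (GCC, Clang) and
well-formed input.  Stack caching and prefetching are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes, not JamVM's instruction set. */
enum { OP_PUSH, OP_ADD, OP_PUSH_ADD, OP_HALT };
enum { MAX_CODE = 64, MAX_STACK = 16 };

/* One slot of threaded code: a handler address or an inline operand. */
typedef union { void *handler; long operand; } Cell;

/* Jump straight to the next handler; there is no central switch. */
#define DISPATCH() goto *(pc++)->handler

/* Assumes a well-formed program that ends in OP_HALT. */
static long run(const long *bytecode, size_t len)
{
    /* Handler addresses, indexed by opcode (GNU C extension). */
    static void *handlers[] = { &&do_push, &&do_add, &&do_push_add, &&do_halt };
    Cell code[MAX_CODE];
    long stack[MAX_STACK];
    long *sp = stack;
    Cell *pc = code;
    size_t i;

    assert(len <= MAX_CODE);

    /* Translate once: each opcode becomes the address of its handler,
       operands are copied through unchanged. */
    for (i = 0; i < len; i++) {
        long op = bytecode[i];
        code[i].handler = handlers[op];
        if (op == OP_PUSH || op == OP_PUSH_ADD) {
            i++;
            code[i].operand = bytecode[i];
        }
    }

    DISPATCH();

do_push:
    *sp++ = (pc++)->operand;
    DISPATCH();
do_add:
    sp--;
    sp[-1] += sp[0];
    DISPATCH();
do_push_add:
    /* Super-instruction: PUSH n followed by ADD, fused into one dispatch. */
    sp[-1] += (pc++)->operand;
    DISPATCH();
do_halt:
    return sp[-1];
}
```

With the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH_ADD, 10,
OP_HALT }, run() returns 15.  The fused OP_PUSH_ADD saves one dispatch
and one stack round trip compared with a separate PUSH and ADD.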

One of the advantages of being a "one man team" is that you know the
code intimately.  While this can lead to spaghetti-like code with many
inter-module dependencies if you're not careful, it can also lead to
compact code, as you're not afraid to re-factor modules and their
interfaces when the time is right.  Having modules written by separate
teams can result in inefficiency and code duplication, as each module
implements its own utilities, e.g. hash tables, lists, etc., or ends
up marshalling arguments for an inappropriate interface.  Trying to
guess every need "up front" in a neat module/interface definition is
doomed to failure.  I believe it is better to start off with a minimal
interface, and then re-factor as experience dictates.  Of course,
there are some very experienced VM implementors on this list, and
several module definitions already, but I like to factor through
experience not anticipation.
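
As a rough sketch of what "start minimal" could look like in C, here
is a heap interface with only two operations, backed by a trivial bump
allocator.  Every name in it (Heap, HeapOps, heap_init and so on) is
hypothetical; it is not JamVM's allocator or any proposed Harmony
interface.  Further entry points would be added only once a real
caller needs them.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Heap Heap;

/* Hypothetical minimal interface: just what a first caller needs. */
typedef struct {
    void  *(*alloc)(Heap *heap, size_t size);
    size_t (*free_bytes)(const Heap *heap);
} HeapOps;

struct Heap {
    const HeapOps *ops;   /* implementation behind the interface */
    unsigned char *base;  /* start of the managed memory */
    size_t size;          /* total bytes managed */
    size_t used;          /* bytes handed out so far */
};

/* Bump allocation: round the size up to 8 bytes and advance an offset.
   Returns NULL when the request does not fit. */
static void *bump_alloc(Heap *h, size_t size)
{
    size_t aligned = (size + 7) & ~(size_t)7;
    void *p;

    if (aligned < size || aligned > h->size - h->used)
        return NULL;
    p = h->base + h->used;
    h->used += aligned;
    return p;
}

static size_t bump_free_bytes(const Heap *h)
{
    return h->size - h->used;
}

static const HeapOps bump_ops = { bump_alloc, bump_free_bytes };

/* Bind a caller-supplied memory region to the bump implementation. */
static void heap_init(Heap *h, void *mem, size_t size)
{
    assert(mem != NULL);
    h->ops = &bump_ops;
    h->base = mem;
    h->size = size;
    h->used = 0;
}
```

Callers go through h.ops->alloc(&h, n) only, so the implementation
behind the table can later be replaced without touching them.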
 
For the record, I believe JamVM to be fairly well "modularised": each
distinct component is in a separate file, with a defined interface. 
There is very little duplicate code, and no private utility
implementations.  The biggest problem is that as yet, I have no
abstraction for stack-walking.  Please note, I'm not putting JamVM up
as an example of a module definition.  I'm sure there are many, many
problems if you were to look at it in detail towards that end.

Regards,

Rob.

----

Robert Lougher (Dr.)


> Upsides:
> - it should be easy for newcomers to get in;
> - while designing, there is still a working implementation, hopefully
> passing many of Mauve tests;
> - having many pieces in place at start;
> - JVM simplicity actually makes design work easier (than with a
> bigger one) by making refactoring less painful (albeit harder than
> designing and implementing from scratch);
> Downsides:
> - minimal JVM usually is compact and its compactness causes code to be
> very interlinked in many places, so module extraction can be sometimes
> irritating and a bit painful;
> - dealing with legacy code causes some extra work (and bugs resulting
> from misunderstanding legacy code, but a working thing along with a
> good set of tests actually makes it easier to squash bugs early);
> - assumptions about object/class/thread/stack/isolate structure layout
> will almost surely change and may be hardwired in many places in
> the legacy code; this is the harder part of refactoring work;
> - minimal JVM design may not have some issues in mind (for example,
> JIT, class verifier etc.);
> 
> Some other general comments (some are truisms, I know):
> 
> - it is important to distinguish between innovation/research and good
> engineering. While designing the framework I'd suggest using proven
> solutions rather than great innovations; after all, a
> reliable/production VM has to be released; well engineered framework
> shall make research easier after all; that is why I'm rather a
> C-camper and I'd suggest to make C-based framework capable to interact
> with Java modules (having read some earlier posts, this is probably
> a truism); make sure to make it efficient as some operations (write
> barriers, for example) are critical when it comes to performance;
> - avoid prematurely sacrificing design for the sake of performance;
> profile, locate and remove performance problems; personally, I do not
> believe the first release will be faster than (or even comparable with)
> Sun's implementation (although Sun VM isn't the fastest one in the
> world); and make sure that we won't wait forever to get on par with
> Sun VM without releasing anything;
> - Java-C/C-Java (internal) interface doesn't necessarily have to be
> slow; look at GCJ and GCJX ( http://sourceforge.net/projects/gcjx )
> and JC (although JC manually loads compiled object and thus does not
> use system linker features, like effective code sharing between VM
> processes); I think it is possible to generate C-callable shared
> libraries from Java code (along with some extra segments for class
> metadata etc.); Unix ELF and Windows PE formats are pretty extensible
> and we may use a plenty of techniques here (autogenerating dedicated
> low-overhead stubs for C, having class metadata embedded in the
> library etc.); a plenty of compiling/linking/symbol tricks and
> techniques are available here;
> - there are many projects we can borrow framework ideas from: ORP for
> example has some good points about GC and JIT interface; Kaffe uses
> COM-like interfaces (GC) etc.
> - having some kind of 'frankenVM' consisting of various pieces doesn't
> have to be inherently bad; did Mono emerge this way or am I wrong ?
> - someone has to build a 'big picture' and split it into parts to make
> people working on details (don't start with one or two interfaces,
> start with a big picture first: several main modules, how shall they
> interact with other modules, where are potential problems); that HAS
> to be done by a person with extensive experience in VM construction
> (engineer rather than researcher); newbies like myself fall short in
> this mainly because of not having dealt with details;
> - sorry for being a bit offensive on researchers ;) it wasn't my
> intention, I just think that we need to have a proven set of things
> first, then may do some good research on it;
> 
> Regards,
> rle
>
