Archie Cobbs wrote:
Robin Garner wrote:
Actually my colleagues at ANU and I were remarking last week that all
the recent discussion on the Harmony list (configure scripts, packed
structs etc.) was close to being proof that Java was the easier
way to go.
Here's some idle speculating about writing a JVM in Java...
Start by asking this question: if you could design a new language
expressly for the purpose of implementing JVMs, what would that
language look like?
Java is almost the right language, but not quite. You need to
be able to do C-like stuff as well.
One can imagine something that is mostly like Java, but has some
additional features that allow C-like functionality, for example:
- Augment the Java type system with C-like "structs". These are
like Java objects in that they can be allocated on the Java heap
(as an option) but have no Object header (you can't synchronize
on them directly and they have no associated Class). Then the
in-memory representation of an Object is a special case of one
of these structures, containing a lockword and vtable pointer.
- Define a new "word" primitive type that corresponds to the
machine-specific word size (i.e., 32 or 64 bit unsigned int).
Corresponds to SableVM's _svm_word and JC's _jc_word.
- The language would include primitives for compare-and-swap of a word,
memory barriers, etc.
- The language would include the ability to cast between any types
as you can do in C (e.g., struct -> Object, word -> Object pointer).
- Allow C function calls to be expressed in the language, passing
as parameters any Java type, or a struct. This "compiles" directly
into a C function call using the platform's normal C calling
conventions.
- Extend the class file format in a corresponding manner.
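To make the struct, word, and compare-and-swap items above a bit more concrete, here is a rough sketch in plain Java. It only simulates the idea: "memory" is an array of 64-bit words, and an object header is two adjacent words (lockword, then vtable pointer). All names here (WordHeap, ObjectHeader, tryLock) are made up for illustration and do not come from any existing VM; a real "Java++" would compile these operations down to raw loads, stores and CAS instructions.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical stand-in for raw memory: an array of 64-bit words. */
final class WordHeap {
    private final AtomicLongArray words;
    private int next = 0; // bump pointer, measured in words

    WordHeap(int sizeInWords) {
        words = new AtomicLongArray(sizeInWords);
    }

    /** Allocates n contiguous words and returns the index of the first one. */
    int allocate(int n) {
        if (next + n > words.length()) {
            throw new IllegalStateException("WordHeap exhausted");
        }
        int base = next;
        next += n;
        return base;
    }

    long load(int addr) {
        return words.get(addr);
    }

    void store(int addr, long value) {
        words.set(addr, value);
    }

    /** Compare-and-swap of a single word; returns true if the swap happened. */
    boolean compareAndSwap(int addr, long expected, long update) {
        return words.compareAndSet(addr, expected, update);
    }
}

/** Hypothetical header "struct": word 0 is the lockword, word 1 the vtable pointer. */
final class ObjectHeader {
    static final int LOCKWORD = 0;
    static final int VTABLE = 1;
    static final int SIZE = 2;

    /** Allocates a header-only object and records its vtable pointer. */
    static int allocate(WordHeap heap, long vtable) {
        int obj = heap.allocate(SIZE);
        heap.store(obj + VTABLE, vtable);
        return obj;
    }

    /** Simplified lock attempt: succeeds only if the lockword is 0 (unlocked). */
    static boolean tryLock(WordHeap heap, int obj, long threadId) {
        return heap.compareAndSwap(obj + LOCKWORD, 0L, threadId);
    }

    static void unlock(WordHeap heap, int obj) {
        heap.store(obj + LOCKWORD, 0L);
    }
}
```

With this in place, ObjectHeader.tryLock(heap, obj, 7L) succeeds on a freshly allocated object, and a second tryLock with a different thread id fails until unlock is called. The locking scheme is deliberately simplistic; it is only there to show a word-sized CAS acting on a struct field.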
Call this language "Java++" (or whatever). Then 95% of the JVM
can be written in this language, and 95% of that would be normal Java.
This is exactly how we see the dialect of Java that MMTk is written in.
The non-Java extensions are the org.vmmagic classes. The key difference
is that our types are represented as 'unboxed' objects, which gives us
more flexibility to define operations on them.
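As a rough illustration of the 'unboxed object' idea, here is a sketch of a word type written as an ordinary Java class. This is not the org.vmmagic API; the class and method names below are hypothetical, and the long field is only an assumption standing in for a machine word. The point is that the source code sees a type with methods, while a VM-aware compiler would be free to represent each value as a bare word with no header.

```java
/**
 * Hypothetical word type. In plain Java this is a normal heap object;
 * a VM-aware compiler could instead treat every Word as a raw machine word.
 */
final class Word {
    private final long bits; // assumed 64-bit word for this sketch

    private Word(long bits) {
        this.bits = bits;
    }

    static Word fromLong(long value) {
        return new Word(value);
    }

    static Word zero() {
        return new Word(0L);
    }

    /** Addition with wrap-around, as on the underlying hardware. */
    Word plus(Word other) {
        return new Word(bits + other.bits);
    }

    Word and(Word other) {
        return new Word(bits & other.bits);
    }

    boolean isZero() {
        return bits == 0L;
    }

    /** Unsigned comparison, since a word is an unsigned quantity. */
    boolean lessThan(Word other) {
        return Long.compareUnsigned(bits, other.bits) < 0;
    }

    long toLong() {
        return bits;
    }
}
```

Because operations are methods on a type rather than new syntax, more can be added (shifts, conversions to addresses or offsets, and so on) without changing the language itself.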
cheers