Re: Terminology etc
Hi,

On the subject of terminology, when you say scheduling, what do you mean exactly? Are you talking about monitors and other sync issues, or are you talking about some kind of m:n thread scheduling? Or neither? :)

Cheers,
Dave

On 5/24/05, Steve Blackburn [EMAIL PROTECTED] wrote:
> Dmitry Serebrennikov wrote:
> > A quick proposal: perhaps VM bootstrap as used below should really be something like VM initialization, VM init, or even VM startup, since the word bootstrap is very specific and, to me at least, indicates something more akin to the VM bootloader + VM boot image (as used below).
>
> That sounds fine with me. It is hard to move outside one's own (limited) terminology :-). We have boot() methods on each of the key components and a boot() method in the VM core which calls these. That's the origin of my terminology. Init seems reasonable and disambiguates the role of the bootloader. Probably the original rationale for using the word boot is that there is a higher level of bootstrapping going on here---getting the classloader loaded before you have a class loader ;-) etc etc. But I agree, init may be helpful.
>
> > * OS interface is perhaps one place where some code can be shared. If the C version can benefit from an OS abstraction layer for portability, then it seems like this layer could be shared between the C and the Java implementations.
>
> I agree. It turns out to be a tiny amount of code though (in the case of Jikes RVM, at least).
>
> > * The meat of the VM seems to be in the spokes that connect to the VM core-hub. It seems that this is where it would make the most sense to mix components written in C with those written in Java, to see which one can do a better job. If all spokes were in C, it would make little sense to have the hub be in Java... On the other hand, if the spokes are all Java, it makes little sense to have the hub be in C. Steve, if the spokes were in Java but the hub in C, would we then lose all of the aggressive inlining benefits that a Java-in-Java solution can provide?
>
> I don't know. These are really interesting questions and ones I think we need to hear lots of informed opinion on. My experience with mixing C and Java inside the VM is limited to the interface to the MM. The inlining issue is key there because of the fine grain at which those interactions occur. Scheduling, compilation and classloading are very much coarser grained, so those issues are much less critical there...
>
> Cheers,
> --Steve
Java bytecode metrics?
I've been wondering about the cost of adding gc points to every backwards branch in LLVM (as would need to be done to make it multithreaded). A paper here http://research.sun.com/techrep/1998/abstract-70.html suggests the cost is around 5% of total running time (compared to code patching).

What I was wondering was, does anyone know of any code metrics for large collections of Java bytecode? E.g. what is the average number of bytecodes in a method, what is the average ratio of backward branches per bytecode, and so on.

Is code patching a technique that has been widely employed in JIT compilers? (I liked the idea of using an access to a write-protected page to reduce the cost of polling!)

Cheers,
Dave
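For readers unfamiliar with the term, here is a minimal sketch of what a gc point on a backward branch amounts to, written as plain C rather than generated code. Every name in it (gc_requested, gc_poll, gc_safepoint_slow, sum_with_polls) is hypothetical and not taken from LLVM or any JVM, and the slow path only counts visits where a real VM would park the thread for the collector. The write-protected page trick mentioned above would replace the load-and-compare in gc_poll with a single load from a page that the collector protects when it wants threads to stop; that variant is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag the collector sets when it wants mutator threads to stop. */
atomic_bool gc_requested = false;

/* Counts slow-path entries; single-threaded bookkeeping for the sketch only. */
unsigned long safepoints_taken = 0;

/* Slow path: a real VM would publish its stack maps and block here.
 * This sketch just records the visit and clears the request. */
static void gc_safepoint_slow(void) {
    safepoints_taken++;
    atomic_store_explicit(&gc_requested, false, memory_order_release);
}

/* The poll itself: one load and one (normally not taken) branch. */
static inline void gc_poll(void) {
    if (atomic_load_explicit(&gc_requested, memory_order_acquire)) {
        gc_safepoint_slow();
    }
}

/* A loop as a compiler might emit it, with a poll on the back-edge. */
long sum_with_polls(const long *a, size_t n) {
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
        gc_poll(); /* executed once per backward branch */
    }
    return total;
}
```

The per-iteration cost is the load and compare inside gc_poll, which is why the ratio of backward branches to other bytecodes matters for estimating the overhead.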
Re: JIRA and SVN
On 20 May 2005 17:54:11 -0600, Tom Tromey [EMAIL PROTECTED] wrote:
> This is too vague -- we don't know much about the unexpected. Plus, in most cases, the core part of the VM is simply not very important. There just isn't much code there -- JamVM is 20KLOC, anybody could comfortably rewrite this.

Hmmm, well I used to work with the author of JamVM (Rob Lougher) and he's one of the brightest guys I know. I think you'll find that the low LOC figure is testament to his ability to write lean code rather than an indication of how easy it is to knock off a JVM on a wet Sunday afternoon.

BTW has anyone asked Rob about donating JamVM to Harmony? As the (currently) sole owner he should have no problem with switching licenses.

Cheers,
Dave
Re: Against using Java to implement Java (Was: Java)
I think it's too slow to have the overhead of a function call for every object allocation. This is the cost of modularization. I doubt any of the mainstream JVMs you are competing with do this.

Cheers,
Dave

On 17 May 2005 18:27:42 -0600, Tom Tromey [EMAIL PROTECTED] wrote:
> David == David Griffiths [EMAIL PROTECTED] writes:
>
> David> Maybe a concrete example would help. Let's say you have a GC module
> David> written in C. One of its API calls is to allocate a new object. How
> David> is your JIT module going to produce code to use that API? Via a C
> David> function pointer?
>
> Yes.
>
> One way is to mandate link- or compile-time pluggability only. Then this can be done by name. Your JIT just references 'harmony_allocate_object' in its source and uses this pointer in the code it generates.
>
> The other way is to have the JIT call some central function to get a pointer to the allocator function (or functions; in libgcj it turned out to be useful to have several). This only needs to be done once, at startup.
>
> For folks interested in pluggability, I advise downloading a copy of ORP and reading through it. ORP already solved these problems in a fairly reasonable way.
>
> Tom
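To make the second option in Tom's message concrete, here is a rough sketch of a JIT asking a central function for the allocator pointer once at startup and then calling through it. This is an illustration under assumptions, not ORP's or libgcj's actual interface: vm_get_allocator, jit_state, jit_init, jit_emitted_new and the allocator names "default" and "pointer_free" are all made up for the example, and calloc stands in for a real GC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Signature shared by every allocator the GC module exports (assumed). */
typedef void *(*alloc_fn)(size_t size);

/* GC module side: two hypothetical allocators, both backed by calloc here. */
static void *gc_alloc_default(size_t size) { return calloc(1, size); }
static void *gc_alloc_pointer_free(size_t size) { return calloc(1, size); }

/* Central table owned by the VM core; the names are illustrative only. */
static const struct {
    const char *name;
    alloc_fn fn;
} allocators[] = {
    {"default", gc_alloc_default},
    {"pointer_free", gc_alloc_pointer_free},
};

/* The "central function": look an allocator up by name, NULL if unknown. */
alloc_fn vm_get_allocator(const char *name) {
    for (size_t i = 0; i < sizeof allocators / sizeof allocators[0]; i++) {
        if (strcmp(allocators[i].name, name) == 0) {
            return allocators[i].fn;
        }
    }
    return NULL;
}

/* JIT side: the pointer is fetched once and cached. */
typedef struct {
    alloc_fn alloc_object;
} jit_state;

int jit_init(jit_state *jit) {
    jit->alloc_object = vm_get_allocator("default");
    return jit->alloc_object != NULL ? 0 : -1;
}

/* Stand-in for the code the JIT would generate for a `new` bytecode:
 * an indirect call through the cached pointer. */
void *jit_emitted_new(const jit_state *jit, size_t size) {
    assert(jit->alloc_object != NULL);
    return jit->alloc_object(size);
}
```

The lookup cost is paid once in jit_init; what remains on every allocation is the indirect call itself, which is the overhead being objected to at the top of this message.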
Re: Against using Java to implement Java
Well, that's the theory, but I think you'll find in practice that real JITs cheat and inline object allocation using their knowledge of the GC internals. And there is no way that a JIT is going to implement synchronized methods by doing a monitorEnter function call!

Dave

On 5/18/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> There is an old document describing a JIT interface, though ORP should be more advanced, for example in having a GC interface.
>
> The JIT Compiler Interface Specification
> http://java.sun.com/docs/jit_interface.html
>
> Sun's Classic VM, which was the reference VM of JDK 1.0.2 and 1.1.X, implements this interface, and it was modified a bit for J2SDK 1.2. There were actually multiple JIT compilers based on this JIT interface, including Symantec JIT, OpenJIT, shuJIT and TYA.
>
> This interface is not enough to support advanced optimizations, including adaptive compilation, which today's Sun and IBM runtimes do. Adaptive compilation needs cooperation from an interpreter (or a baseline compiler) and I am not sure whether it can be factored out from the JVM core.
>
> From: Tom Tromey [EMAIL PROTECTED]
> > David> Maybe a concrete example would help. Let's say you have a GC module
> > David> written in C. One of its API calls is to allocate a new object. How
> > David> is your JIT module going to produce code to use that API? Via a C
> > David> function pointer?
> >
> > Yes.
> >
> > One way is to mandate link- or compile-time pluggability only. Then this can be done by name. Your JIT just references 'harmony_allocate_object' in its source and uses this pointer in the code it generates.
> >
> > The other way is to have the JIT call some central function to get a pointer to the allocator function (or functions; in libgcj it turned out to be useful to have several). This only needs to be done once, at startup.
> >
> > For folks interested in pluggability, I advise downloading a copy of ORP and reading through it. ORP already solved these problems in a fairly reasonable way.
>
> Kazuyuki Shudo [EMAIL PROTECTED] http://www.shudo.net/
Re: Against using Java to implement Java (Was: Java)
I know, but despite the subject line my original point was about the problem of modularizing a VM written in C.

Cheers,
Dave

On 5/18/05, Steve Blackburn [EMAIL PROTECTED] wrote:
> This subject has been covered in detail at least twice already.
>
> There is no need for any function call on the fast path of the allocation sequence.
>
> In a Java-in-Java VM the allocation sequence is inlined into the user code. This has considerable advantages over a few lines of assembler. Aside from the obvious advantage of not having to express your allocator in assembler, using Java also compiles to better code, since the code can be optimized in context (register allocation, constant folding etc), producing much better code than hand-crafted assembler.
>
> However, this is small fry compared to the importance of compiling write barriers correctly (barriers are used by most high-performance GCs and are executed far more frequently than allocations). The same argument applies though. The barrier expressed in Java is inlined in situ, and the optimizing compiler is able to optimize in context.
>
> Modularization does not imply any of this.
>
> --Steve
>
> Weldon Washburn wrote:
> > On 5/18/05, David Griffiths [EMAIL PROTECTED] wrote:
> > > I think it's too slow to have the overhead of a function call for every object allocation. This is the cost of modularization. I doubt any of the mainstream JVMs you are competing with do this.
> >
> > Yes. I agree. A clean interface would have a function call for every object allocation. However, if the allocation code itself is only a few lines of assembly, it can be inlined by the JIT. Using a moving GC, it is possible to get the allocation sequence down to a bump of the pointer and a compare to see if you have run off the end of the allocation area.
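The "bump the pointer and a compare" sequence Weldon describes can be sketched in a few lines of C. This is only an illustration of the technique, not code from any of the VMs discussed: alloc_region, region_init, alloc_fast and alloc_slow are hypothetical names, sizes are assumed to be already rounded to the object alignment (a JIT generally knows the object size at compile time), and the slow path simply returns NULL where a real VM would fetch a fresh region or trigger a collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A contiguous allocation area, typically one per thread (assumed layout). */
typedef struct {
    uint8_t *cursor; /* next free byte */
    uint8_t *limit;  /* one past the end of the area */
} alloc_region;

void region_init(alloc_region *r, uint8_t *base, size_t size) {
    assert(r != NULL && base != NULL);
    r->cursor = base;
    r->limit = base + size;
}

/* Slow path, out of line. A real VM would refill the region or run a GC;
 * this sketch just reports failure. */
void *alloc_slow(alloc_region *r, size_t size) {
    (void)r;
    (void)size;
    return NULL;
}

/* Fast path: one compare against the limit, one pointer bump.
 * This is the part a JIT would inline at each allocation site. */
static inline void *alloc_fast(alloc_region *r, size_t size) {
    if (size > (size_t)(r->limit - r->cursor)) {
        return alloc_slow(r, size); /* ran off the end of the area */
    }
    void *obj = r->cursor;
    r->cursor += size;
    return obj;
}
```

Only alloc_fast needs to be visible to the compiler for inlining; alloc_slow can stay behind a function call, because it runs once per region rather than once per object.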
Third Way - implement the JVM in a new C/Java hybrid
No, but seriously. :)

I agree that C sucks as a JVM implementation language for lots of reasons: the Java/C impedance mismatch, the lack of ability to aggressively inline (which will be a big issue if you ever try to achieve real modularity), and poor trace/debug/profiling ability (wouldn't it be nice to have a coherent view of where your time was being spent?).

The idea of implementing it in Java sounds nice except for all the tricks you have to do to get round the infinite recursion problem.

So why not invent a new language that is a kind of halfway house between C and Java? It will be dynamically compiled just like Java but won't come with a lot of the other stuff like GC and threading. Instead it will have raw access to the platform system calls, plus some kind of support for C-style memory pointers. The same compilation engine would compile both the JVM parts written in this language and suitably converted Java bytecode.

I don't think inventing the language and writing a parser for it is the hard part (it's pretty much Java-- with a dash of C); the hard part is doing the optimising back-ends for all the different platforms. But you've got to do that anyway unless you go the GCJ route.

If it's possible to write a JVM in Java (as has been demonstrated) then it's surely even easier to implement it in a purpose-built language. Such a language might have other system-programming uses. Lots of existing JNI code could be converted to it.

Well, that's just another thought in case you were in danger of reaching a consensus. :)

Cheers,
Dave
Re: Third Way - implement the JVM in a new C/Java hybrid
I thought GCJ was a static compilation system? What I was thinking of was fully dynamic JIT-style compilation. A lot of the problems with using C as the implementation language stem from its statically compiled nature. Not to mention the craziness of having platform-specific code generation and optimisation duplicated in both the C compiler and the Java JIT (which, admittedly, GCJ avoids).

Cheers,
Dave

On 5/13/05, Bob [EMAIL PROTECTED] wrote:
> > So why not invent a new language that is a kind of halfway house between C and Java?
>
> I think that GCJ gives you this third way already. And it comes with a GC, which once explicitly managed, could be used as the Harmony GC as well. (GCJ's GC has an older pedigree, I believe).