Hi Maciej,

I am a PhD student at the Australian National University. I am a colleague of John Zhang and the chief designer of the Mu project, a micro virtual machine (http://microvm.org). Let me introduce the project to this mailing list.

TL;DR: Implementing a managed language is hard. Existing VMs are either too big or too monolithic. Mu, the micro VM, handles only concurrency, JIT compilation and GC, which an average language designer probably doesn't want to (and is unlikely to have the expertise to) work with. A "client" program can then use Mu to implement its language.

Our group at the ANU coined the concept of "micro virtual machines" ("micro VMs"), a play on the concept of the "microkernel" in the OS literature. Mu is a concrete micro VM (in the same way that seL4 is a concrete microkernel). Mu was previously called MicroVM or µVM, but we changed the name to distinguish "our particular micro VM" from "the general concept of micro virtual machines".

Our motivation: Many managed programming language implementations suck. CPython, for example, uses a GIL and naive reference-counting GC, and has no JIT. People may want to create alternative implementations. Currently they either build on an existing (macro) VM or build a new VM from scratch. Existing VMs (like the JVM, the CLR, etc.) usually provide too much abstraction. They cause semantic gaps and bring in a lot of unnecessary dependencies, yet still do not work well with languages they were not designed for (see how PyPy outperforms Jython). Others (including PyPy) build a whole new VM, doing everything from scratch. There are many "high-performance VM" projects like PyPy (LuaJIT, v8, JavaScriptCore and HHVM, to name a few), but the most important low-level parts (JIT, GC, ...) are not reused between them.

We proposed a third option. A "micro virtual machine" is minimal, providing only the abstractions of concurrency, execution (JIT) and garbage collection, which we identify as the three major concerns that make language implementation hard. Another program, which we call a "client", sits above the micro VM and implements the concrete language.

Here are some facts about Mu, our concrete micro VM:

1. It has a specification which defines the behaviour, and multiple implementations are allowed. We also have a reference implementation.

2. The architecture has two layers: a micro VM at the bottom, and a "client" on top of it. The client interacts with the micro VM through an API.

3. Its type system is at a level similar to C's, but with object references. The level is similar to the LL type system in PyPy. It has fixed-size integers, FP numbers, references, structs, arrays, ..., but no Java-like object hierarchy. (The micro VM is minimal and does not know about OOP. The client implements its own types using structs, arrays, ...)

4. Its instruction set is LLVM-like and uses the SSA form, but it is designed for managed languages.

5. Heap memory allocation is a primitive instruction. Heap memory is garbage-collected.

6. Mu has a garbage collector. All references in Mu can be identified, whether they are in the heap, on the stack, in global variables or held for the client (in which case the reference is exposed as an opaque handle, as in JNI). With all references identified, Mu can perform precise (accurate, exact) garbage collection.

7. Mu has threads which can run simultaneously (they are expected to be implemented as OS threads, but the spec does not require it). Mu also has a C++11-like memory model. (Yes. Mu has RELAXED, CONSUME, ACQUIRE, RELEASE, ACQ_REL, SEQ_CST to scare your children.)

8. Mu's code loading unit is the "bundle" (like the ".class" file in Java). The format is called "Mu IR" (like LLVM IR), and a bundle contains code and top-level definitions. The client can define new code at run time. (You can JIT-compile high-level programs to Mu IR at run time.)

9. Mu has the "TRAP" instruction, which pauses the execution of a Mu IR program and executes a "handler" in the client. The handler can introspect the state of the execution (including local variables) and perform OSR (on-stack replacement: removing a stack frame and pushing a new frame, probably of a newly-compiled function). After the handler returns, the client decides where the thread should continue.

10. Mu has a simple exception handling mechanism. It does not rely on system libraries.

John Zhang is working on making RPython a front-end language of Mu (in other words, making Mu a back-end of RPython). We think the LL type system is roughly at the same level as Mu's, but since Mu already has reference types, a heap allocation instruction and an internal garbage collector, RPython no longer needs to insert low-level implementation details for GC (like read/write barriers, GC-safe points, stack maps and so on).

However, Mu does not expose the memory directly as bytes. For this reason, some implementation strategies are no longer applicable in Mu. For example, an array copy must be done element by element, and memcpy cannot be used.

Regards,
Kunshan Wang

On 18/03/2015 7:57 pm, Maciej Fijalkowski wrote:
> Hi John.
>
> Can you describe the microVM and it's capabilities? Chances are it
> captures things at the wrong level (I have a longer response in mind,
> but I'll wait for you to describe it, in case I'm plain wrong)
>
> What do you mean by "provides a GC"? Does it mean you just call malloc
> and you never have to call free?
>
> Generally speaking we don't suggest you translate pypy as a first
> step, but instead write tests (equivalent to what's in
> translator/c/test) and check aspects of translation one bit at a time.
> That said, dependency on rweakref even when disabled is a bug, can you
> post a full traceback?
>
> Cheers,
> fijal
>
> On Wed, Mar 18, 2015 at 2:01 AM, John Zhang <u5157...@uds.anu.edu.au> wrote:
>> Hi all,
>> I'm working on developing a MicroVM backend for PyPy. It's a virtual
>> machine under active research and development by my colleagues in ANU. It
>> aims to capture GC, threading and JIT in the virtual machine, and frees up
>> the burden of the language implementers.
>>
>> Since MicroVM provides GC, I need to remove GC from the PyPy
>> interpreter.
>> As I was trying to compile it with the following command:
>>     pypy $PYPY/rpython/bin/rpython \
>>         -O0 \
>>         --gc=none \
>>         --no-translation-rweakref \
>>         --annotate \
>>         --rtype \
>>         --translation-backendopt-none \
>>         $PYPY/pypy/goal/targetpypystandalone.py
>> It gives off an error during annotation stage, saying that it's not able
>> to find a module called '_rweakref'.
>> Does anyone know what the problem might be, and how one might go and
>> solve it?
>>
>> Appreciate greatly,
>> John Zhang
>> _______________________________________________
>> pypy-dev mailing list
>> pypy-dev@python.org
>> https://mail.python.org/mailman/listinfo/pypy-dev
> _______________________________________________
> pypy-dev mailing list
> pypy-dev@python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>