(Having missed last night's dinner and all the fun there...) When Rémi started talking about this interpreter today, I thought it'd be tailcall related.
In college, my introductory course to computer systems was based on the book "Introduction to Computing Systems: From Bits & Gates to C & Beyond", where it introduces a small educational 16-bit architecture called LC-2. We were given a LC-2 simulator to run our "assembly" programs. Later on, when I've learned enough Java to be dangerous, I wanted to write a more powerful simulator (interpreter) of the LC-2 ISA in Java. My first version was a most straightforward switch-threaded one. Nothing much to be said about that. The next version, wanting to mimic a indirect-threaded style, was structured in very much the same way Rémi presented in the link, but MethodHandle was still years from available so I didn't have the luxury of using it. Instead, I modeled the opcode handlers as "single-method objects", and do a virtual call at the end of each handler. (So, instead of Rémi's MH calls, I'm just using a table lookup + virtual call) public class NopHandler extends AbstractHandler // or implements Handler { public void execute(State s) { HANDLERS[s.code[s.pc++]].execute(s); // dispatch } } That's didn't work out quite well. The performance was okay-ish with small program inputs. But HotSpot's JIT compilers doesn't implement tail call optimizations (well, recursive inlining sort of counts...), so with a long-enough program the stack blows up. And it didn't take that long of a LC-2 assembler program to make it blow up. I realized that I had to have proper tail calls to make it work. So I added a central dispatch loop to serve as a trampoline, e.g. Handler next = getNextHandler(); while (next != null) { next = next.execute(s); } where my Handler.execute() would still perform the lookup, but not the actual invocation. As expected, the performance was no where as good as directly making the virtual call at the tail position in each handler. This central dispatch loop makes the unpredictable-ness of the Handler.execute() significant. Need a good trampoline. Need proper tail calls. - Kris On Wed, Aug 3, 2016 at 3:14 PM, John Rose <john.r.r...@oracle.com> wrote: > On Aug 3, 2016, at 1:22 PM, Remi Forax <fo...@univ-mlv.fr> wrote: > > > > Charles ask if we can make a fast register interpreter in Java, > > here is my take on a threaded-like interpreter > > > > https://gist.github.com/forax/f38b533e089217cfc4d0ae3c6e2de9c9 > > Nicely done. We were talking last night at dinner about making > bytecode interpreters go fast, and this is helpful. > > I think there may be a combinator here, for bytecode dispatch loops. > (This feels like the PIC combinator, both fundamental and tricky to get > right.) > A "bytecode" dispatch combinator might provide a pattern like the > following: > > MethodHandle init, pred, step, fini, index; > @NonNull MethodHandle[] tab; > R dispatch(A… a) { > V v = init(a…); // one here; there might be many v… > while (pred(a…)) { > v = step(v, a…); > // execute one "instruction": > int i = index(v, a…); > tab[i].invokeExact(v, a…); > } > return fini(V, a…); > } > > The explicit table would be recopied internally to an @Stable array for > constant folding. > The various index values could be tracked and turned into traces (or some > other profile > driven structure) which would be compiled together, in specialized > segments of the unrolled > interpreter loop. > > Or maybe just the table lookup part alone makes a good combinator, to be > paired > with the loop combinators? > > Comments welcome… > > — John > > P.S. Another new RFE: > experimental hook for creating heisenboxes > https://bugs.openjdk.java.net/browse/JDK-8163133 > _______________________________________________ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >
_______________________________________________ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev