Heywood Floyd <soul...@gmail.com> wrote:
Thanks for the answer!
Ok, hm, how about this then:
auto opTable = [&op_add, &op_cmp, &op_end]; //etc.
ubyte[] eval(ubyte[] prog)
{
int ip = 0, sp = 0;
ubyte[4096] stack;
next:
goto opTable[prog[ip++] & OP_TABLE_MASK];
op_add:
stack[sp] += stack[sp-1];
goto next;
op_cmp:
stack[sp] = stack[sp-1] == stack[sp-2];
goto next;
/// and so on...
op_end:
return stack;
}
enum opTable : int {
op_add,
op_cmp,
op_end,
// etc
}
ubyte[] eval(ubyte[] prog) pure {
int ip = 0, sp = 0;
ubyte[4096] stack;
while ( true ) {
final switch ( cast( opTable )( prog[ip++] & OP_TABLE_MASK ) ) {
case op_add:
stack[sp] += stack[sp-1];
continue;
case op_cmp:
// blahblahblah
continue;
// ???
case op_end: // Profit!
return stack;
}
}
}
What I'm looking for here is a way of interpreting code without creating
branches in the machine code, unless the interpreted code actually does
a branch (ie a conditional jump). Seems to me a switch would introduce
branching (?) of some sort.
Seems to me a goto would introduce a branch, so I'm not sure doing if
your way actually causes less branching.
I mean, even if switch is implemented as a jump table, it would still do
some basic bounds checking, or?
Final switch to the rescue!
http://digitalmars.com/d/2.0/statement.html#FinalSwitchStatement
Essentially, mark the switch as final, and cover every option.
Likely, the optimizer does that for you if you cover every option but
don't mark the switch as final.
I'm also interested in trying to inline the "next"-operation here, ie
like
string op_next(){ return "goto opTable[prog[ip++] & OP_TABLE_MASK];"; }
//...
op_add:
stack[sp] += stack[sp-1];
mixin(op_next());
..in order to reduce a jump. Of course I'm just playing around with
different strategies for creating a fast interepreter. In C, at least,
using a jump table instead of a switch is faster, especially in 32-bit
mode (according to some very simple experiments, which may or may not
hold water in reality™).
Now this, this would not work with a simple switch, no.
Any ideas for how to make a jump-table interpreter in D? Is it doable
with inline asm perhaps? If at least not for any other reason than to
investigate if it's faster. (Or is it a stupid idea to begin with? Is
this overkill? : )
It likely is overkill, but I'm no expert in these matters. My forays
into the inline asm idea proved fruitless, but there may yet be ways.
PS. How do you export assembler code from the DMD-compiler?
Not sure what you mean here. Do you want an assembler-listing as the
output of the compiler? If so, I don't think there's a way except for
obj2asm or similar disassemblers.
--
Simen