On 25-Jul-12 03:31, Walter Bright wrote:
On 7/24/2012 3:33 PM, Dmitry Olshansky wrote:
There is no code just jump [ecx]. The offset is included already.

I'm not seeing where "code" is in the asm code you presented.

Sorry, I have no idea why it is not the same. jumptable is a static
array, and so does not need to be loaded into a register. And code[]
needs to be loaded from somewhere, and it looks like it's omitted from
your example.

Code was assumed to be in ebx obviously. BTW Static array for jump table is all good and well but does this trick work with PIC code? And last but not least - the jump target has to be _read_ from jump table
 and then jumped to it isn't it?

OK I've taken your comments into account.
Now I think I finally got it right:

mov ecx, [ebx] ; ecx = code[pc]
inc ebx ; pc ++
jmp ecx ; goto code[pc], as ecx is already a pointer

vs

mov ecx, [ebx] ; ecx = code[pc]
inc ebx ; ; inc pc
jump jump_table[ecx]; ; switch jump to it

or in english, ommiting PC increment:
1.
read x from array
jump to it
2.
 read x from array
 read jump address from 2nd static array at offset x
 jump to it

So to sum it all up: 1 vs 2 memory accesses, same register use, same (macro) instruction count. Still I think that not having to touch extra static table is bonus and jump jump_table[ecx] could be less efficient on some processors then plain jump ecx (need to checks this).

Now the same thing with switches.
And before you ask -
NO! Single switch won't do as it would have successful branch
prediction rate of
as bad as about 2% (see the article!). The more opcode types the worse
prediction is.

I see what you mean, but this could be done with final switch by the
compiler, as I explained to bearophile.

Great but I still try to show that they are less efficient, see above.


//GLOBALS
size_t[] bytecode = firstPassCompile("my_source.ds"); //say it's
DMDscript ;)

void run(size_t pc)
{
     //here we don't JIT to real addresses beforehand
     //as jump tables somehow should be good enough
     // Okay...

     //interpret:
     switch(bytecode[pc])
     {
     L_op1:
     case 0:
[snip]
     L_opN:
     case N-1:
     ...
     //DISPATCH:
     //here comes trouble, Nth table ... time to count overhead
         switch(bytecode[pc])
         {
         case 0:    goto L_op1;
         case 1: goto L_op2;
         ...
         }
     }//end of giant switch
}

I see what you mean here, too. Thanks for the explanation. It never
occurred to me that one could write code like that, but I see the point,
and doing jump table merging could be done fairly easily. No new
language feature is required.

Superb! I actually tried the code above, generating common things with a help of string mixins, of course currently it only gets slightly slower.

Should I file an enhancement request?


--
Dmitry Olshansky

Reply via email to