21-Jul-2014 00:50, Walter Bright пишет:
On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:
Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.

Traditional FP languages don't have loops, and so must do recursion. D
has loops, even in pure functions, there's no reason not to use them.


Another use case is so-called threaded code interpreter, that can be
done with
either computed gotos (and bunch of labels) or forced tail calls (and
bunch of
functions). In both cases computed jump or tail call is indirect.

http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/


Actually that blog entry is wrong about it. Computed goto won't help when used as direct replacement of switch and you are correct in that the said code could be easily optimized with final switch.

What would help a lot is thread-code interpreter, that is bytecode contains direct addresses used in computed goto.

The computed goto is faster for two reasons, according to the article:

1.The switch does a bit more per iteration because of bounds checking.

Now let's consider proper implementation of thread-code interpreter.
where *code pointer points to an array of addresses. We've been through this before and it turns out switch is slower because of an extra load.

a) Switch does 1 load for opcode, 1 load for the jump table, 1 indirect jump to advance
(not even counting bounds checking of the switch)

b) Threaded-code via (say) computed goto does 1 load of opcode and 1 indirect jump, because opcode is an address already (so there is no separate jump table).

I'm certain that forced tail call would work just fine instead of computed goto for this scenario. In fact I've measured this with LDC and the results are promising but only work with -O2/-O3 (where tail call is optimized). I'd gladly dig them up if you are interested.


--
Dmitry Olshansky

Reply via email to