Dan Sugalski wrote:
> We might want to have one fast and potentially big loop (switch or computed
> goto) with all the alternate (tracing, Safe, and debugging) loops use the
> indirect function dispatch so we're not wedging another 250K per loop or
> something.

Absolutely. There's no gain from doing computed goto for those anyway,
because the per-op overhead makes direct threading impossible. Brent Dax
already posted an example of why this is bad.

Function calls are not slow. It's the extra jumps and table lookups that
are slow. If a mode has extra overhead, it won't see any advantage with
computed goto over function calls. (At least this is what I've found
testing on Pentium III and Athlon. Most RISC systems should see similar
effects. Older CISC systems with slow function calls may be a different
story.)

BTW, 250K for the size of the inlined dispatch loop is way too big. The
goal should be to put the hot ops inline and leave the other ones out.
Ideally the dispatch loop will fit into L1 cache -- maybe 8K or so.

IMHO we'd be a lot better off inlining some of the PMC methods as ops
instead of trig functions. ;)

- Ken
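
To make the split concrete, here is a minimal sketch of the two kinds of loop discussed above: one fast loop with the ops inlined in a switch, and one alternate loop that goes through a function table and does extra work per op. Everything in it is an assumption for illustration: the three-op bytecode, `vm_t`, `run_fast`, `run_traced`, and `op_table` are made-up names, not Parrot's actual ops or API, and the counter stands in for whatever a tracing or debugging loop would really do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-op bytecode, for illustration only. */
enum { OP_INC, OP_ADD2, OP_END, OP_COUNT };

typedef struct { long acc; } vm_t;

/* Fast loop: the (few) hot ops are inlined in a single switch,
 * so the whole loop stays small. */
long run_fast(const unsigned char *code)
{
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:  acc += 1; break;
        case OP_ADD2: acc += 2; break;
        case OP_END:  return acc;
        default:      return -1;   /* unknown op */
        }
    }
}

/* Op functions shared by the alternate loops. Each returns the next
 * program counter, or NULL to stop. */
typedef const unsigned char *(*op_fn)(vm_t *, const unsigned char *);

static const unsigned char *op_inc(vm_t *vm, const unsigned char *pc)
{
    vm->acc += 1;
    return pc + 1;
}

static const unsigned char *op_add2(vm_t *vm, const unsigned char *pc)
{
    vm->acc += 2;
    return pc + 1;
}

static const unsigned char *op_end(vm_t *vm, const unsigned char *pc)
{
    (void)vm;
    (void)pc;
    return NULL;
}

static const op_fn op_table[OP_COUNT] = {
    [OP_INC]  = op_inc,
    [OP_ADD2] = op_add2,
    [OP_END]  = op_end,
};

/* Alternate loop: indirect function dispatch plus per-op work.
 * The counter is a stand-in for tracing or debugging hooks. */
long run_traced(const unsigned char *code, long *ops_run)
{
    vm_t vm = { 0 };
    const unsigned char *pc = code;
    long n = 0;

    while (pc != NULL) {
        assert(*pc < OP_COUNT);       /* guard the table lookup */
        n++;                          /* per-op overhead */
        pc = op_table[*pc](&vm, pc);
    }
    *ops_run = n;
    return vm.acc;
}
```

Both loops give the same result for a program such as `{ OP_INC, OP_ADD2, OP_INC, OP_END }`. The alternate loops only add a small dispatcher each because they share the op functions, so the size cost of inlining is paid once, in the fast loop.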