I’m having fun calculating the average performance of different switch 
implementations, to be used in the DFG JIT and in our inline caching logic.  
I’ve got two possible implementations that appear to be very nearly optimal 
assuming the capabilities of the hardware we care about and other constraints 
of our JITs.  Both implementations involve a tree of branches, and they both 
come into play in exactly those cases where a tree of branches is already known 
to be less expensive than a lookup table.

Here’s the basic switch JIT algorithm:

jitSwitchRecurse(cases) {
    if (cases.size < N) {
        for (C in cases)  {
            emit code that checks, and executes, C
        }
        fall through to default
    } else {
        // Note that this only divides-and-conquers and does not explicitly 
check and execute the median case; I’ve calculated that this is the superior 
choice.  You can follow along with this here: 
https://bugs.webkit.org/show_bug.cgi?id=141046
        let M = right-median case in cases // for even sizes of cases, pick the 
greater of the two medians; for odd cases, pick either the median or the value 
to the right of the median at random.  the point is to split the list into two 
equal-sized lists
        lessThanJump = emit lessThan comparison against M
        jitSwitchRecurse(those cases that are >= M)
        lessThanJump.link()
        jitSwitchRecurse(those cases that are < M)
    }

The two implementations of this algorithm that I’m picking between either have 
N = 2 or N = 3 - i.e. we either stop divide-and-conquering once we have two 
remaining cases, or when we have 3 remaining cases. Those two versions are the 
most optimal for a variety of reasons, but neither is obviously better than the 
other: N = 3 is more optimal for the case that the switch bottoms out in one of 
the cases.  N = 2 is more optimal if the switch bottoms out in the default 
(“fall through”) case.  N = 2 is significantly more optimal for default: it 
usually results in two fewer comparisons before getting to default.  N = 3 is 
slightly more optimal for the non-default (i.e. hitting some explicit case) 
case: it usually results in one fewer comparisons before getting to some case.

We could have a switch emitter that selects the strategy based on profiling 
(did we hit the default case, or not?), but that isn’t always available and it 
might be overkill.  We could tune this for benchmarks, and I might do some of 
that.  But I’m curious what other people’s intuitions and experiences are: do 
your switch statements usually hit some case, or take default?  If you could 
control which one was faster, which one of those would you be most likely to 
make faster in the kind of code that you write?

-Filip

_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev

Reply via email to