I’m having fun calculating the average performance of different switch
implementations, to be used in the DFG JIT and in our inline caching logic.
I’ve got two possible implementations that appear to be very nearly optimal
assuming the capabilities of the hardware we care about and other constraints
of our JITs. Both implementations involve a tree of branches, and they both
come into play in exactly those cases where a tree of branches is already known
to be less expensive than a lookup table.
Here’s the basic switch JIT algorithm:
jitSwitchRecurse(cases) {
if (cases.size < N) {
for (C in cases) {
emit code that checks, and executes, C
}
fall through to default
} else {
// Note that this only divides-and-conquers and does not explicitly
check and execute the median case; I’ve calculated that this is the superior
choice. You can follow along with this here:
https://bugs.webkit.org/show_bug.cgi?id=141046
let M = right-median case in cases // for even sizes of cases, pick the
greater of the two medians; for odd cases, pick either the median or the value
to the right of the median at random. the point is to split the list into two
equal-sized lists
lessThanJump = emit lessThan comparison against M
jitSwitchRecurse(those cases that are >= M)
lessThanJump.link()
jitSwitchRecurse(those cases that are < M)
}
The two implementations of this algorithm that I’m picking between either have
N = 2 or N = 3 - i.e. we either stop divide-and-conquering once we have two
remaining cases, or when we have 3 remaining cases. Those two versions are the
most optimal for a variety of reasons, but neither is obviously better than the
other: N = 3 is more optimal for the case that the switch bottoms out in one of
the cases. N = 2 is more optimal if the switch bottoms out in the default
(“fall through”) case. N = 2 is significantly more optimal for default: it
usually results in two fewer comparisons before getting to default. N = 3 is
slightly more optimal for the non-default (i.e. hitting some explicit case)
case: it usually results in one fewer comparisons before getting to some case.
We could have a switch emitter that selects the strategy based on profiling
(did we hit the default case, or not?), but that isn’t always available and it
might be overkill. We could tune this for benchmarks, and I might do some of
that. But I’m curious what other people’s intuitions and experiences are: do
your switch statements usually hit some case, or take default? If you could
control which one was faster, which one of those would you be most likely to
make faster in the kind of code that you write?
-Filip
_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev