On Sat, Oct 01, 2022 at 01:20:08PM +0000, realhet via Digitalmars-d-learn wrote:
> Hello,
>
> I just wanted to optimize a byte -> index lookup, by using a 256
> element table instead of using [1, 2, 3].countUntil(x) and I was
> amazed what I've found.
> My solution lookup[x] was not faster at all, because LDC2 amazingly
> optimized the linear search to a jump table.
>
> Anyone please can tell me, how it is possible?

Because it's LDC. ;-)  Or more specifically, the D front-end tells the
LLVM back-end that the table size is statically known, and the back-end
pattern-matches linear searches through fixed-size tables into a jump
table.

Generally, I wouldn't bother optimizing minutiae like these unless the
profiler indicates the hotspot is there.  Just let the optimizer do its
job.  Quite often the optimizer can do a better job than a human can.

> I looked inside the countUntil() template and I see no static ifs for
> compile time optimizations.

It has nothing to do with the Phobos code; it's a pattern recognized by
LDC's optimizer.
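
For illustration, here's a minimal sketch of the scenario (my own
reconstruction with made-up table contents, not the OP's actual code).
Compile it with `ldc2 -O2` and inspect the disassembly of lookup():

    import std.algorithm.searching : countUntil;

    // Fixed-size table whose contents are fully visible to the
    // compiler.
    immutable ubyte[] table = [7, 13, 42, 99, 200];

    ptrdiff_t lookup(ubyte x)
    {
        // A plain linear search, like the OP's countUntil call; the
        // back-end can pattern-match this into a jump table.
        return table[].countUntil(x);
    }

    void main()
    {
        import std.stdio : writeln;
        writeln(lookup(42));  // prints 2
    }

Whether the back-end actually emits a jump table, a bit test, or a
series of compares depends on the size and contents of the table, so
it's worth double-checking the generated code rather than assuming.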
> Is it true that: The compiler notices that there is a while loop on a
> static compile time array and it is clever enough to solve all the
> calculations in compile time and generate a completely unrolled loop?
> Then the optimizer transforms it to the jump table and do it with "jmp
> rax" ?

This is quite likely the case.  I've personally seen LDC optimize an
entire benchmark into a single instruction, because it figured out that
it could run the function at compile time and replace the entire
function body with an instruction that just loads the final value.

> If I'm right, this is not just std library magic, it works even with
> my own template functions.
> I'm also getting crazy long compile times, so this gotta be the case
> :D
[...]

Yep.
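
If you want to watch that happen yourself, here's a small sketch (again
my own example with made-up numbers, not the actual benchmark I
mentioned): a pure function over compile-time-known data that the
optimizer can evaluate while compiling:

    // Sums the squares of the table entries.  All inputs are known at
    // compile time.
    int sumOfSquares(const(int)[] a) pure
    {
        int total = 0;
        foreach (x; a)
            total += x * x;
        return total;
    }

    immutable int[] data = [1, 2, 3, 4, 5];

    void main()
    {
        import std.stdio : writeln;
        // With `ldc2 -O2` this typically compiles down to loading the
        // precomputed constant 55; sumOfSquares is never called at run
        // time.
        writeln(sumOfSquares(data));
    }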

T

-- 
If it tastes good, it's probably bad for you.