On Thu, 29 Oct 2020 14:23:04 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> The static `ThreadHeapSampler::_log_table` is currently initialized on JVM
>> bootstrap to an overhead of ~67k instructions (linux-x64). By turning the
>> initialization into a constexpr, we can precalculate the helper table at
>> compile time, which trades a runtime overhead for a small, 8kb, static
>> footprint increase.
>>
>> I compared `fast_log2` with the `log2` builtin with a naive benchmarking
>> experiment[1] (not included in this PR) and show that the `fast_log2` is
>> ~2.5x faster than `log2` on my system. And that without the lookup table
>> we'd be much worse. So I think it makes sense to preserve this optimization,
>> but get rid of the startup overhead:
>>
>> [5.428s][debug][heapsampling] log2, 0.0751173 secs
>> [5.457s][debug][heapsampling] fast_log2, 0.0298244 secs
>> [5.622s][debug][heapsampling] fast_log2_uncached, 0.1645569 secs
>>
>> I've verified that this refactoring does not affect performance in this
>> naive setup.
>>
>> [1] https://github.com/openjdk/jdk/compare/master...cl4es:log2_micro?expand=1
>
> Claes Redestad has updated the pull request incrementally with two additional
> commits since the last revision:
>
>  - Add explicit include of logging
>  - Add const, fix copyright

Marked as reviewed by iklam (Reviewer).

src/hotspot/share/runtime/threadHeapSampler.cpp line 353:

> 351:   const uint32_t y = x_high >> (20 - FastLogNumBits) & FastLogMask;
> 352:   const int32_t exponent = ((x_high >> 20) & 0x7FF) - 1023;
> 353:   return exponent + log_table[y];

The pre-generated values look good to me. Just a small nit. You can decide
whether to take this or not:

C++ will generate an error if you have too many elements in
`log_table[FastLogCount] = { ...`, but won't complain if you have too few.
Any elements without an explicit initializer are zero-initialized.
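To illustrate the point about short initializer lists, here is a minimal sketch. The names `kCount`, `kShortTable`, `kFullTable` and `last_entry_set` are hypothetical stand-ins, not the identifiers used in threadHeapSampler.cpp, and the table values are made up. It assumes the real last entry of the table is never 0.

```cpp
#include <cstddef>

// Hypothetical stand-in for FastLogCount; the real table is much larger.
constexpr std::size_t kCount = 4;

// Only three initializers for four slots: this compiles without a
// diagnostic, and the missing last element is zero-initialized.
constexpr double kShortTable[kCount] = { 0.1, 0.2, 0.3 };

// All four slots have an explicit initializer.
constexpr double kFullTable[kCount] = { 0.1, 0.2, 0.3, 0.4 };

// True when the last slot is non-zero, which is taken here as evidence
// that the initializer list was not truncated.
template <std::size_t N>
constexpr bool last_entry_set(const double (&table)[N]) {
  return table[N - 1] != 0.0;
}

// Because the tables are constexpr, the check can run at compile time.
static_assert(last_entry_set(kFullTable), "table should be full");
static_assert(!last_entry_set(kShortTable), "truncated table is detected");
```

Since the table in this PR is a constexpr, a `static_assert` of this shape might be an alternative to a runtime check, though whether it fits the HotSpot code as written is an assumption.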
To prevent cut-and-paste errors in the future, I would suggest adding
something like

    assert(FastLogCount > 0 && log_table[FastLogCount-1] != 0, "table should be full");

-------------

PR: https://git.openjdk.java.net/jdk/pull/880