On Mon, 1 Feb 2021 18:18:51 GMT, Erik Joelsson <er...@openjdk.org> wrote:
>> Under certain load, MemoryCache operations take a substantial fraction of
>> the time needed to complete SSL handshakes. This series of patches improves
>> performance characteristics of MemoryCache, at the cost of a functional
>> change: expired entries are no longer guaranteed to be removed before live
>> ones. Unused entries are still removed before used ones, and cache
>> performance no longer depends on its capacity.
>>
>> First patch in the series contains a benchmark that can be run with `make
>> test TEST="micro:CacheBench"`.
>> Baseline results before any MemoryCache changes:
>> Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
>> CacheBench.put   20480      86400  avgt   25    83.653 ±  6.269  us/op
>> CacheBench.put   20480          0  avgt   25     0.107 ±  0.001  us/op
>> CacheBench.put  204800      86400  avgt   25  2057.781 ± 35.942  us/op
>> CacheBench.put  204800          0  avgt   25     0.108 ±  0.001  us/op
>> there's a nonlinear performance drop between 20480 and 204800 entries,
>> probably attributable to CPU cache thrashing. Beyond 204800 entries the
>> cache scales more linearly.
>>
>> Benchmark results after the 2nd and 3rd patches are pretty similar, so I'll
>> only copy one:
>> Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
>> CacheBench.put   20480      86400  avgt   25     0.146 ±  0.002  us/op
>> CacheBench.put   20480          0  avgt   25     0.108 ±  0.002  us/op
>> CacheBench.put  204800      86400  avgt   25     0.150 ±  0.001  us/op
>> CacheBench.put  204800          0  avgt   25     0.106 ±  0.001  us/op
>> The third patch improves worst-case times on a mostly idle cache by
>> scattering removal of expired entries over multiple `put` calls. It does not
>> affect performance of an overloaded cache.
>>
>> The 4th patch removes all code that clears cached values before handing them
>> over to the GC. [This
>> comment](https://github.com/openjdk/jdk/commit/5859a0320334bfb6b46b62eb16b4c387641f4a2a#diff-c6bd583a97fbc4f471621fee7eab37c63718cdb6932ce357fa403cfda4b32b6fL346)
>> stated that clearing values was supposed to be a GC performance
>> optimization. It wasn't. Benchmark results after that commit:
>> Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
>> CacheBench.put   20480      86400  avgt   25     0.113 ±  0.001  us/op
>> CacheBench.put   20480          0  avgt   25     0.075 ±  0.002  us/op
>> CacheBench.put  204800      86400  avgt   25     0.116 ±  0.001  us/op
>> CacheBench.put  204800          0  avgt   25     0.072 ±  0.001  us/op
>> I wasn't expecting that much of an improvement, and don't know how to
>> explain it.
>>
>> The 40ns difference between cache with and without a timeout can be
>> attributed to 2 `System.currentTimeMillis()` calls; they were pretty slow on
>> my VM.
>
> make/test/BuildMicrobenchmark.gmk line 97:
>
>> 95:     SRC := $(MICROBENCHMARK_SRC), \
>> 96:     BIN := $(MICROBENCHMARK_CLASSES), \
>> 97:     JAVAC_FLAGS := $(JAVAC_FLAGS) --add-exports
>> java.base/sun.security.util=ALL-UNNAMED, \
>
> Why do you need to add $(JAVAC_FLAGS) here?

I'm trying to benchmark a class that is in a non-exported package
`sun.security.util`. Without this line the benchmark doesn't compile.
I couldn't find any other benchmarks that access non-exported classes, so I
came up with my own implementation. Is there a better way to get the benchmark
to compile?

-------------

PR: https://git.openjdk.java.net/jdk/pull/2255
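
For readers who want a concrete picture of the behaviour described in the quoted PR text (unused entries evicted before used ones, expired entries cleaned up a little at a time during `put`), here is a minimal sketch. It is not the MemoryCache code from the patches: the class name `SketchCache`, the use of an access-ordered `LinkedHashMap`, and the limit of two entries examined per `put` are all assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the actual sun.security.util.MemoryCache.
class SketchCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt; // 0 means "never expires"
        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final int capacity;
    private final long lifetimeMillis; // 0 disables expiry
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<K, Entry<V>> map =
            new LinkedHashMap<>(16, 0.75f, true);

    SketchCache(int capacity, long lifetimeMillis) {
        this.capacity = capacity;
        this.lifetimeMillis = lifetimeMillis;
    }

    synchronized void put(K key, V value) {
        long now = lifetimeMillis == 0 ? 0 : System.currentTimeMillis();
        long expiresAt = lifetimeMillis == 0 ? 0 : now + lifetimeMillis;
        map.put(key, new Entry<>(value, expiresAt));

        // Bounded cleanup: examine at most two of the least recently used
        // entries per put (assumed limit), so no single call scans the map.
        Iterator<Map.Entry<K, Entry<V>>> it = map.entrySet().iterator();
        for (int i = 0; i < 2 && it.hasNext(); i++) {
            Entry<V> e = it.next().getValue();
            if (e.expiresAt != 0 && e.expiresAt <= now) {
                it.remove();
            } else {
                break;
            }
        }

        // Over capacity: drop the least recently used entry, whether or not
        // it has expired. Cost does not depend on the capacity.
        if (map.size() > capacity) {
            Iterator<K> keys = map.keySet().iterator();
            keys.next();
            keys.remove();
        }
    }

    synchronized V get(K key) {
        Entry<V> e = map.get(key); // also marks the entry as recently used
        if (e == null) {
            return null;
        }
        if (e.expiresAt != 0 && e.expiresAt <= System.currentTimeMillis()) {
            map.remove(key);
            return null;
        }
        return e.value;
    }

    synchronized int size() {
        return map.size();
    }
}
```

In this sketch every `put` does a constant amount of work regardless of capacity, and with a non-zero lifetime `put` and `get` each read the clock once, which is the kind of overhead the quoted 40ns remark refers to.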