Github user ben-manes commented on the issue:
https://github.com/apache/metron/pull/940
Internally Guava uses a `ConcurrentLinkedQueue` and an `AtomicInteger` to
record its size, per segment. When a read occurs, it records that in the queue
and then drains it under the segment's lock (via tryLock) to replay the events.
Caffeine follows the same general design, but with optimized structures. I intended the CLQ & counter as baseline scaffolding to be replaced, since they are an obvious bottleneck, but I could never get that replacement in despite advocating for it. The penalty of draining the buffers is amortized, but unfortunately this buffer isn't capped.
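For illustration, here is a minimal sketch of that per-segment bookkeeping, using only the structures named above. The class and field names are hypothetical, not Guava's actual internals:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the pattern described above: reads are appended
// to an unbounded ConcurrentLinkedQueue per segment, counted by an
// AtomicInteger, and replayed under the segment's lock via tryLock.
final class SegmentSketch<K> {
    private final ConcurrentLinkedQueue<K> recencyQueue = new ConcurrentLinkedQueue<>();
    final AtomicInteger readCount = new AtomicInteger();
    private final ReentrantLock lock = new ReentrantLock();
    int drained = 0; // stand-in for the replayed LRU reorderings

    void recordRead(K key) {
        recencyQueue.add(key);        // unbounded: a read is never dropped
        readCount.incrementAndGet();
    }

    void tryDrain() {
        if (lock.tryLock()) {         // skip if another thread is draining
            try {
                K key;
                while ((key = recencyQueue.poll()) != null) {
                    readCount.decrementAndGet();
                    drained++;        // replay the access against the LRU chain
                }
            } finally {
                lock.unlock();
            }
        }
    }
}
```

Because the queue is unbounded, a burst of reads between drains grows it without limit, which is the bottleneck described above.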
Since a larger cache has a higher hit rate, reads are recorded more often. Perhaps the contention there and the penalty of draining the queue are more observable than a cache miss. That's still surprising, since a cache miss usually means more expensive I/O. Is the loader doing expensive work in your case?
Caffeine gets around this problem by using more optimized buffers and by being lossy (on reads only) if it can't keep up. By default it delegates the amortized maintenance work to a ForkJoinPool to avoid user-facing latencies, since you'll want those variances to be tight. Much of that could be backported onto Guava for a nice boost.
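A minimal sketch of the lossy idea, using an `ArrayBlockingQueue` as a stand-in (Caffeine's real read buffers are striped ring buffers; this only illustrates the drop-when-full policy, and the class name is hypothetical):

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical lossy read buffer: when the bounded buffer is full,
// the read event is simply dropped instead of blocking the reader.
final class LossyReadBuffer<K> {
    private final ArrayBlockingQueue<K> buffer;

    LossyReadBuffer(int capacity) {
        buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns true if the read was recorded, false if it was dropped. */
    boolean record(K key) {
        return buffer.offer(key); // non-blocking: lossy under contention
    }

    /** Drains and returns the number of recorded reads replayed. */
    int drain() {
        int n = 0;
        while (buffer.poll() != null) {
            n++;                      // stand-in for replaying the access
        }
        return n;
    }
}
```

Dropping a read only makes the eviction policy slightly less informed, whereas dropping a write would corrupt the cache, which is why the lossiness applies to reads only.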
---