[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549086#comment-15549086 ]

Eshcar Hillel commented on HBASE-15560:
---------------------------------------

Hi, can I add my view of this issue?

I think the gap between what is required by the community and what can be 
provided is not that big.

1) [~ben.manes] you already have the results of the YCSB benchmark you ran 
with the initial patch.
Can you rerun these tests with the latest patch and publish the results in 
some form?
I suggest you publish the exact settings you used plus the raw results 
(rather than just the lift).
You can either present a comparison table of the mean latency plus the high 
(95th/99th) percentiles over different cache sizes, or depict the dynamics of 
the latency throughout the run in a graph (by using the '-s' flag -- I can 
explain offline), or, best of all, do both. 
If you dig into the region server log you can find records of the hit ratio, 
which you could also plot alongside the latency; that could be nice to see.
These results would show that when combining HBase and Caffeine there is no 
overhead, and in some cases a measurable benefit, even in synthetic 
workloads. (A small sketch of the summary statistics follows below.)
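
For the comparison table, the summary statistics can be computed from the 
raw samples along these lines (a minimal sketch; the hardcoded array is a 
placeholder for the raw YCSB latency output):

{code:java}
import java.util.Arrays;

// Mean plus nearest-rank 95th/99th percentiles over raw latency samples.
public final class LatencySummary {
  public static void main(String[] args) {
    // Placeholder raw read latencies in microseconds.
    long[] latenciesUs = {512, 498, 730, 1210, 505, 499, 2300, 640, 501, 509};
    Arrays.sort(latenciesUs);
    System.out.printf("mean=%.1fus p95=%dus p99=%dus%n",
        mean(latenciesUs), percentile(latenciesUs, 95), percentile(latenciesUs, 99));
  }

  static double mean(long[] sorted) {
    double sum = 0;
    for (long v : sorted) {
      sum += v;
    }
    return sum / sorted.length;
  }

  // Nearest-rank percentile over a sorted array.
  static long percentile(long[] sorted, int p) {
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }
}
{code}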

2) [~stack] if the results of these experiments satisfy the community, then 
the default can be switched to TinyLFU, with LRU remaining optional, and the 
change pushed to master. This would allow the community to experiment with 
this feature more easily, and to modify it if necessary.

3) Ben briefly described the results of the benchmarks when using a static 
distribution. Here is my explanation of the results (Ben, feel free to 
correct me if I'm wrong):
The distribution of the items is skewed but *static*, with a small (high 
frequency) head and a long (low frequency) tail.
With a given cache size -- after the cache is warm -- the items at the head 
fill the second segment (which is 80% of the cache in TinyLFU) and the 
following items fill the first segment.
With LRU, items from the tail of the distribution occasionally cause 
evictions from the first segment, which later translate into cache misses 
and increased latency, while TinyLFU tends to keep the items with higher 
frequency in the cache, which results in fewer misses (see the toy 
simulation below). As the cache grows, fewer and fewer items are evicted 
from the cache and the difference diminishes.
With a *dynamic* distribution, items are continuously evicted from the cache, 
and here the benefit of TinyLFU should be clearly pronounced.
We have traces of production workloads that would potentially have a skewed 
dynamic distribution.
However, we can neither share them nor do we currently have the resources to 
turn them into a running benchmark.
We could try to make an effort in this direction if this becomes a 
make-or-break point.
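
Here is a toy simulation of that argument (the universe size, cache size, 
and skew are made-up assumptions, and an exact counter stands in for the 
frequency sketch): under a static skewed stream, plain LRU lets rare tail 
keys evict hot head keys, while a frequency-comparison admission filter 
keeps the head resident and misses less.

{code:java}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public final class AdmissionToy {
  // Access-ordered LinkedHashMap as a tiny LRU; the eldest key is the victim.
  static final class Lru extends LinkedHashMap<Integer, Boolean> {
    Lru() {
      super(16, 0.75f, true);
    }
    Integer victim() {
      return keySet().iterator().next();
    }
  }

  public static void main(String[] args) {
    int universe = 10_000, capacity = 100, ops = 1_000_000;
    Random rnd = new Random(42);
    Lru lru = new Lru();
    Lru filtered = new Lru();
    Map<Integer, Integer> freq = new HashMap<>();
    int lruHits = 0, filteredHits = 0;

    for (int i = 0; i < ops; i++) {
      // Static skewed distribution: low keys are far more popular.
      int key = (int) (universe * Math.pow(rnd.nextDouble(), 4));
      int f = freq.merge(key, 1, Integer::sum);

      // Plain LRU: always admit, evicting the least recently used.
      if (lru.get(key) != null) {
        lruHits++;
      } else {
        if (lru.size() >= capacity) {
          lru.remove(lru.victim());
        }
        lru.put(key, true);
      }

      // LRU + admission: replace the victim only if the candidate is hotter.
      if (filtered.get(key) != null) {
        filteredHits++;
      } else if (filtered.size() < capacity) {
        filtered.put(key, true);
      } else if (f > freq.getOrDefault(filtered.victim(), 0)) {
        filtered.remove(filtered.victim());
        filtered.put(key, true);
      }
    }
    System.out.printf("LRU hit rate: %.3f, with admission filter: %.3f%n",
        lruHits / (double) ops, filteredHits / (double) ops);
  }
}
{code}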

Would this be acceptable: 1) [~ben.manes] runs the static YCSB benchmark; 2) 
[~stack] commits TinyLFU as the default; 3) benchmarks with dynamic workloads 
follow, either from us or from others in the community?

> TinyLFU-based BlockCache
> ------------------------
>
>                 Key: HBASE-15560
>                 URL: https://issues.apache.org/jira/browse/HBASE-15560
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache
>    Affects Versions: 2.0.0
>            Reporter: Ben Manes
>            Assignee: Ben Manes
>         Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> tinylfu.patch
>
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and 
> recency of the working set. It achieves concurrency by using an O(n) 
> background thread to prioritize the entries and evict. Accessing an entry is 
> O(1) by a hash table lookup, recording its logical access time, and setting a 
> frequency flag. A write is performed in O(1) time by updating the hash table 
> and triggering an async eviction thread. This provides ideal concurrency and 
> minimizes the latencies by penalizing the thread instead of the caller. 
> However, the policy does not age the frequencies and may not be resilient to 
> various workload patterns.
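>
> As a rough sketch of that access/evict split (the names and structure here 
> are invented for illustration; LruBlockCache's real code differs):
>
> {code:java}
> import java.util.Comparator;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.atomic.AtomicLong;
>
> // Reads are O(1) metadata updates; a write over capacity hands the O(n)
> // prioritize-and-evict pass to a background thread.
> final class SketchLruCache<K, V> {
>   private static final class Entry<V> {
>     volatile V value;
>     volatile long accessTime;  // logical access clock
>     volatile boolean frequent; // promoted on re-access
>   }
>
>   private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
>   private final AtomicLong clock = new AtomicLong();
>   private final ExecutorService evictor = Executors.newSingleThreadExecutor();
>   private final int capacity;
>
>   SketchLruCache(int capacity) {
>     this.capacity = capacity;
>   }
>
>   V get(K key) {
>     Entry<V> e = map.get(key);
>     if (e == null) {
>       return null;
>     }
>     e.accessTime = clock.incrementAndGet(); // O(1) bookkeeping only
>     e.frequent = true;
>     return e.value;
>   }
>
>   void put(K key, V value) {
>     Entry<V> e = new Entry<>();
>     e.value = value;
>     e.accessTime = clock.incrementAndGet();
>     map.put(key, e);
>     if (map.size() > capacity) {
>       // Penalize the background thread instead of the caller.
>       evictor.execute(this::evict);
>     }
>   }
>
>   // O(n): order entries by (frequent, accessTime) and drop the coldest.
>   private void evict() {
>     map.entrySet().stream()
>         .sorted(Comparator
>             .comparing((Map.Entry<K, Entry<V>> en) -> en.getValue().frequent)
>             .thenComparingLong(en -> en.getValue().accessTime))
>         .limit(Math.max(0, map.size() - capacity))
>         .forEach(en -> map.remove(en.getKey()));
>   }
> }
> {code}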
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the 
> frequency in a counting sketch, ages periodically by halving the counters, 
> and orders entries by SLRU. An entry is discarded by comparing the frequency 
> of the new arrival (candidate) to the SLRU's victim, and keeping the one with 
> the highest frequency. This allows the operations to be performed in O(1) 
> time and, through the use of a compact sketch, a much larger history is 
> retained beyond the current working set. In a variety of real world traces 
> the policy had [near optimal hit 
> rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
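>
> A minimal sketch of that admission mechanism (the depth, counter width, and 
> per-row hashing are illustrative simplifications, not Caffeine's actual 
> implementation):
>
> {code:java}
> // Count-min style sketch with saturating counters, aged by halving, used
> // to choose between the new arrival and the SLRU victim.
> public final class TinyLfuAdmittor {
>   private static final int DEPTH = 4;
>   private final int[][] table; // DEPTH rows of counters
>   private final int mask;      // row width is a power of two
>   private final int sampleSize;
>   private int additions;       // triggers the periodic aging
>
>   public TinyLfuAdmittor(int width, int sampleSize) {
>     int w = Math.max(16, Integer.highestOneBit(width));
>     this.table = new int[DEPTH][w];
>     this.mask = w - 1;
>     this.sampleSize = sampleSize;
>   }
>
>   public void recordAccess(Object key) {
>     int h = spread(key.hashCode());
>     for (int i = 0; i < DEPTH; i++) {
>       int idx = (h >>> (i << 3)) & mask; // simplified per-row hashing
>       if (table[i][idx] < 15) {          // saturate like a 4-bit counter
>         table[i][idx]++;
>       }
>     }
>     if (++additions >= sampleSize) {
>       halve(); // age the history so stale frequencies fade
>     }
>   }
>
>   public int frequency(Object key) {
>     int h = spread(key.hashCode());
>     int min = Integer.MAX_VALUE;
>     for (int i = 0; i < DEPTH; i++) {
>       min = Math.min(min, table[i][(h >>> (i << 3)) & mask]);
>     }
>     return min;
>   }
>
>   // Keep whichever of the candidate and the victim is historically hotter.
>   public boolean admit(Object candidate, Object victim) {
>     return frequency(candidate) > frequency(victim);
>   }
>
>   private void halve() {
>     for (int[] row : table) {
>       for (int i = 0; i < row.length; i++) {
>         row[i] >>>= 1;
>       }
>     }
>     additions /= 2;
>   }
>
>   private static int spread(int x) { // cheap bit mixing
>     x ^= (x >>> 16);
>     x *= 0x45d9f3b;
>     return x ^ (x >>> 16);
>   }
> }
> {code}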
> Concurrency is achieved by buffering and replaying the operations, similar to 
> a write-ahead log. A read is recorded into a striped ring buffer and a write 
> into a queue. The operations are applied in batches under a try-lock by an 
> asynchronous thread, thereby tracking the usage pattern without incurring high 
> latencies 
> ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
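>
> A minimal sketch of that buffering idea (a single bounded queue standing in 
> for the striped ring buffers):
>
> {code:java}
> import java.util.concurrent.ArrayBlockingQueue;
> import java.util.concurrent.locks.ReentrantLock;
>
> // Reads are appended to a lossy bounded buffer and replayed against the
> // policy in batches under a try-lock, so readers never block on eviction.
> final class BufferedPolicy<K> {
>   private static final int DRAIN_THRESHOLD = 512;
>   private final ArrayBlockingQueue<K> readBuffer = new ArrayBlockingQueue<>(1024);
>   private final ReentrantLock evictionLock = new ReentrantLock();
>
>   void recordRead(K key) {
>     // Lossy on overflow: dropping a read only loses a usage hint.
>     readBuffer.offer(key);
>     if (readBuffer.size() >= DRAIN_THRESHOLD && evictionLock.tryLock()) {
>       try {
>         drain();
>       } finally {
>         evictionLock.unlock();
>       }
>     }
>   }
>
>   private void drain() {
>     K key;
>     while ((key = readBuffer.poll()) != null) {
>       applyToPolicy(key);
>     }
>   }
>
>   private void applyToPolicy(K key) {
>     // The real cache would reorder its SLRU lists and update the
>     // frequency sketch here; omitted in this sketch.
>   }
> }
> {code}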
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit 
> rates) the two caches have near identical throughput and latencies with 
> LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a 
> 1-4% hit rate improvement and therefore lower latencies. The lackluster 
> result is because a synthetic Zipfian distribution is used, on which SLRU 
> performs optimally. In a more varied, real-world workload we'd expect to see 
> improvements by being able to make smarter predictions.
> The provided patch implements BlockCache using the 
> [Caffeine|https://github.com/ben-manes/caffeine] caching library (see 
> HighScalability 
> [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for 
> evaluating this patch ([github 
> branch|https://github.com/ben-manes/hbase/tree/tinylfu]).
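>
> As a minimal sketch of the wiring (the key/value types below are simplified 
> stand-ins for HBase's block types; see the branch above for the actual 
> patch):
>
> {code:java}
> import com.github.benmanes.caffeine.cache.Cache;
> import com.github.benmanes.caffeine.cache.Caffeine;
>
> // A block cache bounded by total block bytes rather than entry count.
> public final class CaffeineBlockCacheSketch {
>   record BlockKey(String hfileName, long offset) {}
>   record Block(byte[] bytes) {}
>
>   private final Cache<BlockKey, Block> cache;
>
>   public CaffeineBlockCacheSketch(long maxBytes) {
>     this.cache = Caffeine.newBuilder()
>         .maximumWeight(maxBytes) // bound by heap usage, not entry count
>         .weigher((BlockKey key, Block block) -> block.bytes().length)
>         .recordStats()           // exposes the hit ratio for the logs
>         .build();
>   }
>
>   public Block getBlock(BlockKey key) {
>     return cache.getIfPresent(key);
>   }
>
>   public void cacheBlock(BlockKey key, Block block) {
>     cache.put(key, block);
>   }
>
>   public double hitRatio() {
>     return cache.stats().hitRate();
>   }
> }
> {code}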


