[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815950#comment-16815950 ]
Sean Busbey commented on HBASE-15560: ------------------------------------- okay in that case I'm +1 > TinyLFU-based BlockCache > ------------------------ > > Key: HBASE-15560 > URL: https://issues.apache.org/jira/browse/HBASE-15560 > Project: HBase > Issue Type: Improvement > Components: BlockCache > Affects Versions: 2.0.0 > Reporter: Ben Manes > Assignee: Ben Manes > Priority: Major > Fix For: 3.0.0, 2.3.0 > > Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, > HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, > HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, > HBASE-15560.patch, bc.hit.count, bc.miss.count, branch-1.tinylfu.txt, gets, > run_ycsb_c.sh, run_ycsb_loading.sh, tinylfu.patch > > > LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and > recency of the working set. It achieves concurrency by using an O( n ) > background thread to prioritize the entries and evict. Accessing an entry is > O(1) by a hash table lookup, recording its logical access time, and setting a > frequency flag. A write is performed in O(1) time by updating the hash table > and triggering an async eviction thread. This provides ideal concurrency and > minimizes the latencies by penalizing the thread instead of the caller. > However the policy does not age the frequencies and may not be resilient to > various workload patterns. > W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the > frequency in a counting sketch, ages periodically by halving the counters, > and orders entries by SLRU. An entry is discarded by comparing the frequency > of the new arrival (candidate) to the SLRU's victim, and keeping the one with > the highest frequency. This allows the operations to be performed in O(1) > time and, though the use of a compact sketch, a much larger history is > retained beyond the current working set. In a variety of real world traces > the policy had [near optimal hit > rates|https://github.com/ben-manes/caffeine/wiki/Efficiency]. > Concurrency is achieved by buffering and replaying the operations, similar to > a write-ahead log. A read is recorded into a striped ring buffer and writes > to a queue. The operations are applied in batches under a try-lock by an > asynchronous thread, thereby track the usage pattern without incurring high > latencies > ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]). > In YCSB benchmarks the results were inconclusive. For a large cache (99% hit > rates) the two caches have near identical throughput and latencies with > LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a > 1-4% hit rate improvement and therefore lower latencies. The lack luster > result is because a synthetic Zipfian distribution is used, which SLRU > performs optimally. In a more varied, real-world workload we'd expect to see > improvements by being able to make smarter predictions. > The provided patch implements BlockCache using the > [Caffeine|https://github.com/ben-manes/caffeine] caching library (see > HighScalability > [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]). > Edward Bortnikov and Eshcar Hillel have graciously provided guidance for > evaluating this patch ([github > branch|https://github.com/ben-manes/hbase/tree/tinylfu]). -- This message was sent by Atlassian JIRA (v7.6.3#76005)