[ 
https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815785#comment-16815785
 ] 

Andrew Purtell edited comment on HBASE-15560 at 4/11/19 9:31 PM:
-----------------------------------------------------------------

{quote}but the fact that tests were changed to only test LRUCache (or only 
assert data is correct if LRUCache) was concerning to me. That indicates to me 
that there is some missing functionality in the TinyLFU
{quote}
You can't make that assumption based on these changes.

What I did is change the default from "LRU" to "TinyLFU" and then execute the 
unit tests. The default remains "LRU". An invocation of the unit test suite 
will test "LRU". During the modified execution I found some places where the 
tests expect LruBlockCache. That assumption is  invalid when L1 is TinyLFU. So, 
where that assumption is made I made it conditional on the actual class. 
Nothing else has changed. This was done to exercise TinyLFU as much as possible 
given the complete unit test suite we have available. This might imply TinyLFU 
needs more test coverage, or it might not, it depends on the unit test. You 
will need to make a closer examination.

As to the rest of your points we are in agreement.

Edit: The TinyLFU contribution comes with its own unit test which is executed 
unconditionally no matter the L1 setting. This is not a comprehensive suite of 
all blockcache functionality, though. It could be a useful follow up item to 
refactor all blockcache tests to make them parameterized, and select "LRU" L1 
for one invocation; then "TinyLFU" L1 for the second invocation. This follow up 
can be readily done on branch-2 and up. On branch-1, because the compilation of 
TinyLFU is conditional and a separate module it seems more complicated and 
maybe difficult to do (some kind of conditional test if tinylfu module and jar 
is in the test classpath maybe?).


was (Author: apurtell):
{quote}but the fact that tests were changed to only test LRUCache (or only 
assert data is correct if LRUCache) was concerning to me. That indicates to me 
that there is some missing functionality in the TinyLFU
{quote}
You can't make that assumption based on these changes.

What I did is change the default from "LRU" to "TinyLFU" and then execute the 
unit tests. The default remains "LRU". An invocation of the unit test suite 
will test "LRU". During the modified execution I found some places where the 
tests expect LruBlockCache. That assumption is  invalid when L1 is TinyLFU. So, 
where that assumption is made I made it conditional on the actual class. 
Nothing else has changed. This was done to exercise TinyLFU as much as possible 
given the complete unit test suite we have available. This might imply TinyLFU 
needs more test coverage, or it might not, it depends on the unit test. You 
will need to make a closer examination.

As to the rest of your points we are in agreement.

> TinyLFU-based BlockCache
> ------------------------
>
>                 Key: HBASE-15560
>                 URL: https://issues.apache.org/jira/browse/HBASE-15560
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache
>    Affects Versions: 2.0.0
>            Reporter: Ben Manes
>            Assignee: Ben Manes
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, 
> HBASE-15560.patch, bc.hit.count, bc.miss.count, branch-1.tinylfu.txt, gets, 
> run_ycsb_c.sh, run_ycsb_loading.sh, tinylfu.patch
>
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and 
> recency of the working set. It achieves concurrency by using an O( n ) 
> background thread to prioritize the entries and evict. Accessing an entry is 
> O(1) by a hash table lookup, recording its logical access time, and setting a 
> frequency flag. A write is performed in O(1) time by updating the hash table 
> and triggering an async eviction thread. This provides ideal concurrency and 
> minimizes the latencies by penalizing the thread instead of the caller. 
> However the policy does not age the frequencies and may not be resilient to 
> various workload patterns.
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the 
> frequency in a counting sketch, ages periodically by halving the counters, 
> and orders entries by SLRU. An entry is discarded by comparing the frequency 
> of the new arrival (candidate) to the SLRU's victim, and keeping the one with 
> the highest frequency. This allows the operations to be performed in O(1) 
> time and, though the use of a compact sketch, a much larger history is 
> retained beyond the current working set. In a variety of real world traces 
> the policy had [near optimal hit 
> rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
> Concurrency is achieved by buffering and replaying the operations, similar to 
> a write-ahead log. A read is recorded into a striped ring buffer and writes 
> to a queue. The operations are applied in batches under a try-lock by an 
> asynchronous thread, thereby track the usage pattern without incurring high 
> latencies 
> ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit 
> rates) the two caches have near identical throughput and latencies with 
> LruBlockCache narrowly winning. At medium and small caches, TinyLFU had a 
> 1-4% hit rate improvement and therefore lower latencies. The lack luster 
> result is because a synthetic Zipfian distribution is used, which SLRU 
> performs optimally. In a more varied, real-world workload we'd expect to see 
> improvements by being able to make smarter predictions.
> The provided patch implements BlockCache using the 
> [Caffeine|https://github.com/ben-manes/caffeine] caching library (see 
> HighScalability 
> [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for 
> evaluating this patch ([github 
> branch|https://github.com/ben-manes/hbase/tree/tinylfu]).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to