[ https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137321#comment-15137321 ]

Ben Manes commented on SOLR-8241:
---------------------------------

I only used LruCache as a template and removed much of it; in hindsight, 
LfuCache might have been easier to work with, since my trimmed-down version 
ended up looking very similar to it.

LFU is substantially better than LRU for search-engine workloads like Solr's. I 
do not have any Solr-specific metrics to offer, but the search-engine traces I 
do have are very promising: LFU is superior to LRU, and TinyLFU is a 
substantial further improvement. If the impact were not so significant, I would 
not be advocating this change. For reference, the hit rates on two public 
traces:

WebSearch1 @ 4M (U. of Mass.)
* Lru: 21.6%
* Lfu: 27.1%
* W-TinyLfu: 41.5%
* Optimal: 57.8%

S3 @ 400K (ARC paper)
* Lru: 12.0%
* Lfu: 29.4%
* W-TinyLfu: 43.0%
* Optimal: 60.0%

Yes, this is Java 8 only. The RemovalListener interface changed from v1.x to 
v2.x so that a lambda works cleanly with the builder's type inference.
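To make the type-inference point concrete, here is a minimal sketch of the v2.x builder usage (the class name and keys are illustrative; the Caffeine calls are the library's public API):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;

public class RemovalListenerDemo {
  public static void main(String[] args) {
    // In v2.x, RemovalListener is a functional interface, so the builder's
    // type inference accepts a lambda directly.
    Cache<String, Object> cache = Caffeine.newBuilder()
        .maximumSize(10_000)
        .removalListener((String key, Object value, RemovalCause cause) ->
            System.out.printf("evicted %s (%s)%n", key, cause))
        .build();

    cache.put("q=solr", new Object());
    System.out.println(cache.getIfPresent("q=solr") != null); // prints "true"
  }
}
```

The explicit lambda parameter types also fix the builder's key/value generics, which is what the v1.x listener shape made awkward.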

Please read this [short 
article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]
 which describes the policy and the concurrency mechanism. That should give you 
enough context to judge this change without taking a deep dive into the 
library's implementation. The patch to Solr's code is quite small.

> Evaluate W-TinyLfu cache
> ------------------------
>
>                 Key: SOLR-8241
>                 URL: https://issues.apache.org/jira/browse/SOLR-8241
>             Project: Solr
>          Issue Type: Wish
>          Components: search
>            Reporter: Ben Manes
>            Priority: Minor
>         Attachments: SOLR-8241.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). 
> The discussions seem to indicate that the higher hit rate (vs LRU) is offset 
> by the slower performance of the implementation. An original goal appeared to 
> be to introduce ARC, a patented algorithm that uses ghost entries to retain 
> history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It 
> uses a frequency sketch to compactly estimate an entry's popularity and LRU 
> to capture recency, operating in O(1) time. On the available academic traces 
> the policy provides a near-optimal hit rate regardless of the workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a 
> dependency on. But, the code is fairly straightforward and a port into Solr's 
> caches instead is a pragmatic alternative. More interesting is what the 
> impact would be in Solr's workloads and feedback on the policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency
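For readers unfamiliar with frequency sketches, the idea quoted above can be sketched as a count-min-style structure with periodic aging. This is a simplification for exposition only, not Caffeine's actual FrequencySketch (which packs 4-bit counters into longs); the class name, seeds, and parameters here are hypothetical:

```java
/**
 * Illustrative count-min-style frequency sketch with periodic aging.
 * Not Caffeine's real implementation; names and constants are hypothetical.
 */
public class SimpleFrequencySketch {
  // Fixed odd seeds, one per hash function (arbitrary constants).
  private static final int[] SEEDS = {0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F};

  private final int[][] table;   // one row of counters per hash function
  private final int mask;        // width - 1; width is a power of two
  private final int sampleSize;  // increments between aging passes
  private int count;             // increments since the last aging pass

  public SimpleFrequencySketch(int width, int sampleSize) {
    if (Integer.bitCount(width) != 1) {
      throw new IllegalArgumentException("width must be a power of two");
    }
    this.table = new int[SEEDS.length][width];
    this.mask = width - 1;
    this.sampleSize = sampleSize;
  }

  /** Records one access of the key. */
  public void increment(Object key) {
    for (int row = 0; row < SEEDS.length; row++) {
      table[row][index(key, row)]++;
    }
    if (++count == sampleSize) {
      age();
    }
  }

  /** Estimates the key's access count; may overestimate due to collisions. */
  public int frequency(Object key) {
    int min = Integer.MAX_VALUE;
    for (int row = 0; row < SEEDS.length; row++) {
      min = Math.min(min, table[row][index(key, row)]);
    }
    return min;
  }

  /** Halves every counter so stale history gradually fades. */
  private void age() {
    for (int[] row : table) {
      for (int i = 0; i < row.length; i++) {
        row[i] >>>= 1;
      }
    }
    count = 0;
  }

  private int index(Object key, int row) {
    int h = key.hashCode() * SEEDS[row];
    h ^= h >>> 16;
    return h & mask;
  }

  public static void main(String[] args) {
    SimpleFrequencySketch sketch = new SimpleFrequencySketch(1024, 100_000);
    for (int i = 0; i < 5; i++) sketch.increment("hot");
    sketch.increment("cold");
    // TinyLFU-style admission: on eviction, admit the candidate only if
    // its estimated frequency beats the victim's.
    System.out.println(sketch.frequency("hot") + " vs " + sketch.frequency("cold"));
  }
}
```

The admission test in main() is the core of the policy: the cache consults the sketch when it is full, and an unpopular new entry cannot displace a frequently used one.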



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
