[ 
https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15517090#comment-15517090
 ] 

Ben Manes commented on SOLR-8241:
---------------------------------

Expiration is tricky because it means the data is no longer valid and should 
not be consumed. The middle ground here is refreshAfterWrite, which serves the 
stale entry while trying to reload its value asynchronously. That covers the 
common case: active entries are not penalized by eviction, while inactive ones 
are still left to expire.
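
As a rough illustration of that middle ground, here is a minimal sketch of 
refresh-after-write semantics using only the JDK. It is not Caffeine's 
implementation: the class RefreshingCache, its constructor parameters, and the 
injected clock and executor are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of refresh-after-write; plain JDK, not Caffeine's code. */
final class RefreshingCache<K, V> {
  private static final class Entry<V> {
    final V value;
    final long writeTime;
    final AtomicBoolean refreshing = new AtomicBoolean();

    Entry(V value, long writeTime) {
      this.value = value;
      this.writeTime = writeTime;
    }
  }

  private final Map<K, Entry<V>> data = new ConcurrentHashMap<>();
  private final Function<K, V> loader;
  private final long refreshAfter;   // same time unit as the clock
  private final LongSupplier clock;  // injected so the behavior is testable
  private final Executor executor;   // runs the background reloads

  RefreshingCache(Function<K, V> loader, long refreshAfter,
      LongSupplier clock, Executor executor) {
    this.loader = loader;
    this.refreshAfter = refreshAfter;
    this.clock = clock;
    this.executor = executor;
  }

  V get(K key) {
    // A missing entry is loaded synchronously, as with any loading cache.
    Entry<V> entry = data.computeIfAbsent(
        key, k -> new Entry<>(loader.apply(k), clock.getAsLong()));
    boolean stale = clock.getAsLong() - entry.writeTime >= refreshAfter;
    // Only the first caller to see a stale entry schedules the reload.
    if (stale && entry.refreshing.compareAndSet(false, true)) {
      executor.execute(() -> {
        try {
          data.put(key, new Entry<>(loader.apply(key), clock.getAsLong()));
        } catch (RuntimeException e) {
          entry.refreshing.set(false); // keep the stale value, allow a retry
        }
      });
    }
    return entry.value; // the caller is never blocked by the reload
  }
}
```

The sketch deliberately has no expiration or size bound. In practice the 
refresh would be paired with a longer expiration so that entries nobody reads 
still age out.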

That probably isn't enough, and it's impossible to cover all use-cases. So 
instead it's more of a data structure that is (hopefully) malleable enough to 
support custom workarounds. The CacheWriter can be used to create a victim 
cache, which a CacheLoader could retrieve from. So you could let expired 
entries populate the victim and be promoted back into the cache, sometimes 
within the same atomic operation. Then a rewarming could clear the victim when 
it's done, as its contents are no longer needed. So something like this might 
be workable.
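
To make the victim-cache idea concrete, here is a minimal JDK-only sketch. It 
does not use Caffeine's CacheWriter or CacheLoader interfaces. The class 
VictimBackedCache and its methods (cleanUp, rewarmed, victimSize) are 
hypothetical stand-ins: cleanUp plays the writer's role of capturing expired 
entries, and the lookup inside get plays the loader's role of checking the 
victim before the slow source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of an expiring cache with a victim map; not Caffeine. */
final class VictimBackedCache<K, V> {
  private static final class Timed<V> {
    final V value;
    final long writeTime;

    Timed(V value, long writeTime) {
      this.value = value;
      this.writeTime = writeTime;
    }
  }

  private final Map<K, Timed<V>> primary = new ConcurrentHashMap<>();
  private final Map<K, V> victim = new ConcurrentHashMap<>();
  private final Function<K, V> source; // the slow, authoritative load
  private final long ttl;              // same time unit as the clock
  private final LongSupplier clock;

  VictimBackedCache(Function<K, V> source, long ttl, LongSupplier clock) {
    this.source = source;
    this.ttl = ttl;
    this.clock = clock;
  }

  V get(K key) {
    long now = clock.getAsLong();
    return primary.compute(key, (k, current) -> {
      if (current != null && now - current.writeTime < ttl) {
        return current; // still fresh
      }
      if (current != null) {
        victim.put(k, current.value); // writer role: demote the expired entry
      }
      V rescued = victim.remove(k);   // loader role: try the victim first
      V value = (rescued != null) ? rescued : source.apply(k);
      return new Timed<>(value, now); // promoted within the same atomic step
    }).value;
  }

  /** Moves every expired entry into the victim, like a maintenance pass. */
  void cleanUp() {
    long now = clock.getAsLong();
    for (K key : primary.keySet()) {
      primary.computeIfPresent(key, (k, current) -> {
        if (now - current.writeTime < ttl) {
          return current;
        }
        victim.put(k, current.value);
        return null; // removes the mapping from the primary
      });
    }
  }

  /** Called once rewarming is done; the victim's contents are then unneeded. */
  void rewarmed() {
    victim.clear();
  }

  int victimSize() {
    return victim.size();
  }
}
```

Until rewarmed() is called, an expired entry costs a map lookup rather than a 
reload from the source. After it is called, the next miss goes to the source 
again.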

> Evaluate W-TinyLfu cache
> ------------------------
>
>                 Key: SOLR-8241
>                 URL: https://issues.apache.org/jira/browse/SOLR-8241
>             Project: Solr
>          Issue Type: Wish
>          Components: search
>            Reporter: Ben Manes
>            Priority: Minor
>         Attachments: SOLR-8241.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). 
> The discussions seem to indicate that the higher hit rate (vs LRU) is offset 
> by the slower performance of the implementation. An original goal appeared to 
> be to introduce ARC, a patented algorithm that uses ghost entries to retain 
> history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It 
> uses a frequency sketch to compactly estimate an entry's popularity. It uses 
> LRU to capture recency and operate in O(1) time. When using available 
> academic traces the policy provides a near optimal hit rate regardless of the 
> workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a 
> dependency on. But, the code is fairly straightforward and a port into Solr's 
> caches instead is a pragmatic alternative. More interesting is what the 
> impact would be in Solr's workloads and feedback on the policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency
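
For readers unfamiliar with the frequency sketch mentioned in the description, 
the following is a small count-min style illustration using only the JDK. It 
is a simplified assumption of how such a sketch can work, not the code in 
Caffeine or in the attached patch: the class FrequencySketch, the hash seeds, 
the counter cap of 15, and the admit comparison are all chosen for the 
example, and the periodic aging of counters is omitted.

```java
/** Hypothetical count-min style sketch for popularity estimates. */
final class FrequencySketch {
  // Arbitrary odd multipliers, one per row, to vary the hash per row.
  private static final int[] SEEDS = {0x9E3779B9, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F};
  private static final int MAX_COUNT = 15; // small saturating counters

  private final byte[][] table;
  private final int mask;

  FrequencySketch(int width) {
    if (width <= 0 || Integer.bitCount(width) != 1) {
      throw new IllegalArgumentException("width must be a power of two");
    }
    this.table = new byte[SEEDS.length][width];
    this.mask = width - 1;
  }

  private int indexOf(Object key, int row) {
    int h = key.hashCode() * SEEDS[row];
    h ^= h >>> 16;
    return h & mask;
  }

  /** Records one access by bumping the key's counter in every row. */
  void increment(Object key) {
    for (int row = 0; row < SEEDS.length; row++) {
      int i = indexOf(key, row);
      if (table[row][i] < MAX_COUNT) {
        table[row][i]++;
      }
    }
  }

  /** The minimum across rows; collisions can inflate it but never lower it. */
  int frequency(Object key) {
    int min = MAX_COUNT;
    for (int row = 0; row < SEEDS.length; row++) {
      min = Math.min(min, table[row][indexOf(key, row)]);
    }
    return min;
  }

  /** Simplified admission check: keep whichever entry looks more popular. */
  boolean admit(Object candidate, Object victim) {
    return frequency(candidate) > frequency(victim);
  }
}
```

The memory cost is fixed by the table size rather than by the number of keys 
seen, which is what makes the estimate compact.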



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
