[ 
https://issues.apache.org/jira/browse/LUCENE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856493#comment-15856493
 ] 

Adrien Grand commented on LUCENE-7680:
--------------------------------------

I see this class as a default set of heuristics that should work well for most 
use-cases. If someone wants something more specific, I think the way to go 
is to write a new impl; the API should be pretty simple to implement. As 
it stands, the class is indeed not designed for inheritance: in addition to 
those pkg-private methods, it is final.
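
To give a rough idea of how small such an impl could be, here is a minimal 
sketch. It is self-contained and does not use Lucene types: the 
CachingPolicySketch interface is a hypothetical stand-in, assumed to have one 
method that is notified on use and one that decides whether to cache, and 
SkipCheapPolicy, neverCache and minUses are made-up names for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical stand-in for a caching-policy API; NOT Lucene's own interface. */
interface CachingPolicySketch<Q> {
    /** Called each time a query is used as a filter. */
    void onUse(Q query);

    /** Returns true if the query should be cached. */
    boolean shouldCache(Q query);
}

/**
 * Toy policy: queries matching neverCache are neither tracked nor cached;
 * everything else is cached once it has been seen at least minUses times.
 * The usage map is unbounded, which a real policy would want to avoid.
 */
final class SkipCheapPolicy<Q> implements CachingPolicySketch<Q> {
    private final Predicate<Q> neverCache;
    private final int minUses;
    private final Map<Q, Integer> uses = new HashMap<>();

    SkipCheapPolicy(Predicate<Q> neverCache, int minUses) {
        this.neverCache = neverCache;
        this.minUses = minUses;
    }

    @Override
    public void onUse(Q query) {
        if (neverCache.test(query)) {
            return; // excluded queries leave no trace in the usage stats
        }
        uses.merge(query, 1, Integer::sum);
    }

    @Override
    public boolean shouldCache(Q query) {
        return !neverCache.test(query) && uses.getOrDefault(query, 0) >= minUses;
    }

    /** Number of distinct queries currently tracked. */
    int trackedQueries() {
        return uses.size();
    }
}
```

With a predicate that matches term filters, such a policy would both skip 
caching them and keep them out of its usage statistics.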

bq. Granted I could implement minFrequencyToCache and return Integer.MAX_VALUE.

Requiring that a filter has been seen Integer.MAX_VALUE times would indeed make 
it never cached. However this change goes a bit further in the case of term 
filters: it also does not add them to the history, which makes other filters 
more likely to be cached than they are today. To take an extreme example, 
say you have a query with 100 term filters and 1 other filter (which is not a 
term). Even if that other filter was the same in every query, it would never 
get cached because term queries "pollute" the history (we only keep track of 
the last 256 used filters) and that other filter would occur two or three times
in the history. By not putting term filters in the history of recently used 
filters, Lucene would be more likely to notice that the other filter gets 
reused all the time.
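
The extreme example above can be played through with a toy model. This is 
only a sketch under stated assumptions, not the actual Lucene code: the 
history is modeled as a plain FIFO of the last 256 filter keys, filters are 
represented as strings, and FilterHistorySketch and simulate are hypothetical 
names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a bounded history of recently used filters; NOT Lucene code. */
final class FilterHistorySketch {
    static final int HISTORY_SIZE = 256; // history size mentioned above

    private final Deque<String> history = new ArrayDeque<>();

    /** Records one use, evicting the oldest entry once the history is full. */
    void record(String filter) {
        if (history.size() == HISTORY_SIZE) {
            history.removeFirst();
        }
        history.addLast(filter);
    }

    /** Counts how often the filter currently appears in the history. */
    int frequency(String filter) {
        int count = 0;
        for (String f : history) {
            if (f.equals(filter)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Runs the given number of queries, each with 100 term filters and one
     * shared non-term filter, and returns how often the shared filter appears
     * in the history afterwards. When trackTermFilters is false, term filters
     * are kept out of the history entirely.
     */
    static int simulate(int queries, boolean trackTermFilters) {
        FilterHistorySketch h = new FilterHistorySketch();
        for (int q = 0; q < queries; q++) {
            for (int t = 0; t < 100; t++) {
                if (trackTermFilters) {
                    h.record("term:" + t);
                }
            }
            h.record("other");
        }
        return h.frequency("other");
    }
}
```

Comparing simulate(50, true) with simulate(50, false) shows the effect: with 
term filters tracked, the shared filter stays in the low single digits no 
matter how many queries run, while without them every recorded entry is the 
shared filter.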

bq. Curious; did you consider marking TermFilter as "cheap"?

What do you mean? Maybe it is the cause of the confusion, but when I say term 
filter, I mean a TermQuery that is consumed with needsScores=false.

> Never cache term filters
> ------------------------
>
>                 Key: LUCENE-7680
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7680
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7680.patch
>
>
> Currently we just require term filters to be used a lot in order to cache 
> them. Maybe instead we should look into never caching them. This should not 
> hurt performance since term filters are plenty fast, and would make other 
> filters more likely to be cached since we would not "pollute" the history 
> with filters that are not worth caching.



