[ 
https://issues.apache.org/jira/browse/LUCENE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6077:
---------------------------------
    Attachment: LUCENE-6077.patch

Updated patch:
 - CachingWrapperFilter now uses a policy that only caches on merged segments 
by default (instead of all segments)
 - applied other suggestions about typos/naming

> Add a filter cache
> ------------------
>
>                 Key: LUCENE-6077
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6077
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: LUCENE-6077.patch, LUCENE-6077.patch
>
>
> Lucene already has filter caching abilities through CachingWrapperFilter, but 
> CachingWrapperFilter requires you to know which filters you want to cache 
> up-front.
> Caching filters is not trivial. If you cache too aggressively, then you slow 
> things down since you need to iterate over all documents that match the 
> filter in order to load it into an in-memory cacheable DocIdSet. On the other 
> hand, if you don't cache at all, you are potentially missing interesting 
> speed-ups on frequently-used filters.
> It would be nice to have a generic filter cache that tracks usage of 
> individual filters and decides whether to cache a filter on a given 
> segment based on usage statistics and various heuristics, such as:
>  - the overhead to cache the filter (for instance some filters produce 
> DocIdSets that are already cacheable)
>  - the cost to build the DocIdSet (the getDocIdSet method is very expensive 
> on some filters such as MultiTermQueryWrapperFilter that potentially need to 
> merge lots of postings lists)
>  - the segment we are searching on (freshly flushed segments will likely be 
> merged right away, so it's probably not worth building a cache on them)
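
To make the heuristics above concrete, here is a minimal sketch of a usage-tracking caching decision in plain Java. It is not the code from the attached patch and not Lucene API: the class name UsageTrackingCachePolicy, the method names, the boolean inputs, and the "halve the threshold" rule are all assumptions chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not Lucene API): decides whether a filter is worth
 * caching on a segment, based on how often it was used and on the three
 * heuristics listed in the issue description.
 */
final class UsageTrackingCachePolicy {
    private final int minUses;
    private final Map<Object, Integer> useCounts = new HashMap<>();

    UsageTrackingCachePolicy(int minUses) {
        if (minUses < 1) {
            throw new IllegalArgumentException("minUses must be >= 1");
        }
        this.minUses = minUses;
    }

    /** Records one use of the filter and returns its updated use count. */
    synchronized int onUse(Object filterKey) {
        return useCounts.merge(filterKey, 1, Integer::sum);
    }

    /**
     * @param segmentIsMerged  false for a freshly flushed segment
     * @param cheapToCache     true if the DocIdSet is already cacheable
     * @param expensiveToBuild true if building the DocIdSet is costly
     */
    synchronized boolean shouldCache(Object filterKey, boolean segmentIsMerged,
                                     boolean cheapToCache, boolean expensiveToBuild) {
        // Flushed segments are likely to be merged away soon: never cache on them.
        if (!segmentIsMerged) {
            return false;
        }
        // Assumed rule: cache sooner when caching is cheap or rebuilding is costly.
        int threshold = minUses;
        if (cheapToCache || expensiveToBuild) {
            threshold = Math.max(1, minUses / 2);
        }
        return useCounts.getOrDefault(filterKey, 0) >= threshold;
    }
}
```

A caller would invoke onUse each time a filter is applied and consult shouldCache per segment before materializing a DocIdSet. The keys must implement equals and hashCode consistently for the counts to be meaningful.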



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
