[ 
https://issues.apache.org/jira/browse/LUCENE-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798384#comment-13798384
 ] 

Adrien Grand commented on LUCENE-5293:
--------------------------------------

I like the idea of using the Elias-Fano doc id set given how it behaves in the 
benchmarks, but it is tricky that it needs to know the size of the set in 
advance. In practice, what you are going to get in CWF.cacheImpl most likely 
comes from QueryWrapperFilters, not from FixedBitSets or OpenBitSets, so there 
is no way to know the exact size in advance. We could use 
DocIdSetIterator.cost, but although implementations are recommended to return 
an upper bound on the number of documents in the set, it could return any 
number. Do you think there would be a way to relax the Elias-Fano doc id set 
building process so that it could be built from an approximation of the number 
of docs in the set (at the cost of some compression loss)?
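
To make the size dependency concrete, here is a minimal sketch of the textbook 
Elias-Fano parameter choice. It is not the code from the attached patch or 
from Lucene's encoder. The class name EliasFanoSizeSketch and both helper 
methods are hypothetical, and the sketch assumes that the number of low bits 
per value is floor(log2(upperBound / numValues)) and that the upper bits take 
storedValues + (upperBound >>> lowBits) bits in unary.

```java
public class EliasFanoSizeSketch {

    /** Textbook low-bit count for numValues values in [0, upperBound] (assumed formula). */
    static int lowBits(long numValues, long upperBound) {
        if (numValues <= 0 || upperBound <= numValues) {
            return 0; // dense or empty set: keep everything in the upper bits
        }
        // floor(log2(upperBound / numValues)); the quotient is >= 1 here
        return 63 - Long.numberOfLeadingZeros(upperBound / numValues);
    }

    /** Bits used when storedValues values are encoded with the given low-bit count. */
    static long sizeInBits(long storedValues, int lowBits, long upperBound) {
        long lowPart = storedValues * lowBits;                    // fixed-width low bits
        long highPart = storedValues + (upperBound >>> lowBits);  // unary-coded high bits
        return lowPart + highPart;
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000L;   // hypothetical upper bound on doc ids
        long actual = 1_000L;       // hypothetical true number of docs in the set
        long estimate = 4_000L;     // hypothetical overestimate, e.g. from cost()

        int exactL = lowBits(actual, maxDoc);
        int approxL = lowBits(estimate, maxDoc);

        System.out.println("low bits with exact size:    " + exactL);
        System.out.println("low bits with estimate:      " + approxL);
        System.out.println("bits used with exact size:   " + sizeInBits(actual, exactL, maxDoc));
        System.out.println("bits used with estimate:     " + sizeInBits(actual, approxL, maxDoc));
    }
}
```

Under these assumptions, deriving the low-bit count from a 4x overestimate 
costs 15812 bits instead of 11953 for the same 1000 docs. The sketch only 
counts bits actually used; an encoder that preallocates its arrays from the 
estimate would reserve more than that.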

> Also use EliasFanoDocIdSet in CachingWrapperFilter
> --------------------------------------------------
>
>                 Key: LUCENE-5293
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5293
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-5293.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
