[
https://issues.apache.org/jira/browse/LUCENE-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798384#comment-13798384
]
Adrien Grand commented on LUCENE-5293:
--------------------------------------
I like the idea of using the Elias-Fano doc id set given how it behaves in the
benchmarks, but it is tricky that it needs to know the size of the set in
advance. In practice, the cache impls that you are going to have in
CWF.cacheImpl are most likely QueryWrapperFilters, not FixedBitSets or
OpenBitSets, so there is no way to know the exact size in advance. We could use
DocIdSetIterator.cost, but although implementations are encouraged to return an
upper bound on the number of documents in the set, the method could return any
number. Do you think there would be a way to relax the Elias-Fano
doc id set building process so that it could be built by providing an
approximation of the number of docs in the set (at the cost of some compression
loss)?
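
To make the trade-off concrete, here is a rough sketch of how the size of an
Elias-Fano encoding depends on the number of values declared up front. It
follows the textbook layout (a fixed number of low bits per value plus a
unary-coded high part) and is not taken from the patch. The class and method
names, maxDoc, and the two counts are made up for illustration, and sizing the
structure for an estimate rather than the exact count is only an assumption
about how a relaxed builder might work.

```java
// Hypothetical sketch: textbook Elias-Fano size estimate, not Lucene's encoder.
final class EliasFanoSizeSketch {

    /** Low bits stored per value: floor(log2(upperBound / numValues)), or 0. */
    public static int numLowBits(long numValues, long upperBound) {
        if (numValues <= 0 || upperBound < 0) {
            throw new IllegalArgumentException("numValues must be > 0 and upperBound >= 0");
        }
        long ratio = upperBound / numValues;
        return ratio > 0 ? 63 - Long.numberOfLeadingZeros(ratio) : 0;
    }

    /** Approximate bits needed when the structure is sized for numValues entries. */
    public static long estimatedBits(long numValues, long upperBound) {
        int lowBits = numLowBits(numValues, upperBound);
        long lowPart = numValues * lowBits;                     // fixed-width low bits
        long highPart = numValues + (upperBound >>> lowBits);   // unary-coded high bits
        return lowPart + highPart;
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000L;        // assumed upper bound on doc ids
        long exactCount = 10_000L;       // assumed true number of docs in the set
        long estimatedCount = 40_000L;   // assumed over-estimate, e.g. from cost()

        System.out.println("sized for exact count:    "
            + estimatedBits(exactCount, maxDoc) + " bits");
        System.out.println("sized for over-estimate:  "
            + estimatedBits(estimatedCount, maxDoc) + " bits");
    }
}
```

With these made-up numbers, sizing for the exact count needs 85625 bits while
sizing for the over-estimate needs 262500 bits, which is the kind of
compression loss an approximate count would introduce.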
> Also use EliasFanoDocIdSet in CachingWrapperFilter
> --------------------------------------------------
>
> Key: LUCENE-5293
> URL: https://issues.apache.org/jira/browse/LUCENE-5293
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Paul Elschot
> Priority: Minor
> Attachments: LUCENE-5293.patch
>
>