[
https://issues.apache.org/jira/browse/LUCENE-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798886#comment-13798886
]
Paul Elschot commented on LUCENE-5293:
--------------------------------------
bq. ... the closest thing we have to a cardinality() which is available for
every DocIdSet.
For a single term used as a filter there is IndexReader.docFreq(Term), and that
does fit in here.
(Ideally in this case the posting list could be copied from the index into the
cache, but we're not there yet.)
Shall I add a check for QueryWrapperFilter.getQuery() being a TermQuery and
then use docFreq()?
There is also a TermFilter in the queries module; I have not looked at its code
yet.
Would it have to move to core so that a check for it can be used here, too?
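
As a rough illustration of the docFreq() check proposed above, here is a minimal,
self-contained sketch. QuerySketch, TermQuerySketch and DocFreqSource are
hypothetical stand-ins, not Lucene classes; only the idea of asking the reader
for a term's document frequency comes from the comment. Because docFreq() does
not account for deleted documents, the value is better read as an upper bound
on the cardinality than as an exact count.

```java
import java.util.OptionalLong;

public class CardinalityHint {
    /** Hypothetical stand-in for a query; not Lucene's Query class. */
    interface QuerySketch {}

    /** Hypothetical stand-in for a single-term query. */
    static final class TermQuerySketch implements QuerySketch {
        final String field;
        final String text;

        TermQuerySketch(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /** Hypothetical stand-in for the one reader call used here. */
    interface DocFreqSource {
        int docFreq(String field, String text);
    }

    /**
     * Returns a cardinality estimate when the wrapped query is a single term,
     * and an empty result when the cardinality cannot easily be determined.
     */
    static OptionalLong estimate(QuerySketch query, DocFreqSource reader) {
        if (query instanceof TermQuerySketch) {
            TermQuerySketch tq = (TermQuerySketch) query;
            return OptionalLong.of(reader.docFreq(tq.field, tq.text));
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Toy reader: pretends the term body:lucene occurs in 42 documents.
        DocFreqSource reader =
            (field, text) -> "body".equals(field) && "lucene".equals(text) ? 42 : 0;
        System.out.println(estimate(new TermQuerySketch("body", "lucene"), reader));
        System.out.println(estimate(new QuerySketch() {}, reader));
    }
}
```

An empty result is the case in which some other strategy is needed to pick the
doc id set implementation.
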
bq. ... maybe we could build a WAH8DocIdSet in any case and replace it with an
EF doc id set when there are not many documents???
For any case in which the cardinality cannot easily be determined, that would
indeed make sense, judging from the benchmark.
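
The fallback discussed above could be sketched as follows: build one encoding
first, count the documents while doing so, and only then decide whether to
re-encode. This is a self-contained illustration, not code from the patch. The
Encoding enum and the method names are hypothetical, and the 1-in-7 density
threshold is an assumed placeholder rather than a value taken from the
benchmark.

```java
public class DocIdSetChoice {
    /** Hypothetical labels for the two encodings under discussion. */
    enum Encoding { WAH8, ELIAS_FANO }

    /** Assumed placeholder: a set counts as sparse below 1 match per 7 documents. */
    static final long ASSUMED_DENSITY_DIVISOR = 7;

    /** True when the set has few documents relative to maxDoc (assumed threshold). */
    static boolean sparseEnough(long numDocsInSet, long maxDoc) {
        return numDocsInSet < maxDoc / ASSUMED_DENSITY_DIVISOR;
    }

    /**
     * Decides which encoding to keep once the first build has finished and the
     * number of documents in the set is therefore known.
     */
    static Encoding choose(long numDocsInSet, long maxDoc) {
        return sparseEnough(numDocsInSet, maxDoc) ? Encoding.ELIAS_FANO : Encoding.WAH8;
    }

    public static void main(String[] args) {
        System.out.println(choose(10, 1000));   // few documents in the set
        System.out.println(choose(500, 1000));  // many documents in the set
    }
}
```

The cost of this approach is a second build for sparse sets, which is the build
time the benchmark would have to show.
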
bq. I can try to update the benchmarks to add the building of an additional FBS
before the EF doc id set.
Adding a single FBS build to the EF DocIdSet can be visualized in the benchmark
for the build times.
In that case the EF DocIdSet build result never gets above log(1) = 0, so an
update of the benchmarks would not be needed.
> Also use EliasFanoDocIdSet in CachingWrapperFilter
> --------------------------------------------------
>
> Key: LUCENE-5293
> URL: https://issues.apache.org/jira/browse/LUCENE-5293
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Paul Elschot
> Priority: Minor
> Attachments: LUCENE-5293.patch
>
>