[
https://issues.apache.org/jira/browse/SOLR-14166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008410#comment-17008410
]
David Smiley commented on SOLR-14166:
-------------------------------------
The PR has the code details, but I want to mention some of the bigger picture
here.
I have this as a sub-task of Remove/refactor Filter because this reduces the
use of the old Filter abstraction. SolrIndexSearcher.ProcessedFilter.filter is
now declared as a Query. SolrIndexSearcher no longer has FilterImpl. Now that
pf.filter is a Query, SolrIndexSearcher.getDocSet(List<Query> fqs) is simpler,
and I was able to remove the similar getDocSetScore.
So how is TwoPhaseIterator used efficiently, you may ask? BooleanQuery's FILTER
clauses use it internally via ConjunctionDISI. I modified
SolrIndexSearcher.getProcessedFilter to create a BooleanQuery with these FILTER
clauses for the non-cached queries.
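
To make the benefit concrete, here is a toy, Lucene-free sketch of two-phase
conjunction. Everything in it (ToyFilter, conjunction, the matchCalls counter)
is hypothetical and exists only for illustration; it is not the code in the PR
and not Lucene's ConjunctionDISI. It assumes each filter exposes a cheap sorted
approximation plus an exact per-doc check with an estimated cost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

public class TwoPhaseConjunctionSketch {

    /** Toy stand-in for one non-cached filter; not a Lucene class. */
    static final class ToyFilter {
        final int[] approximation;   // sorted candidate doc IDs (cheap phase)
        final IntPredicate matches;  // exact per-doc check (expensive phase)
        final float matchCost;       // estimated cost of one exact check
        int matchCalls = 0;          // instrumentation for this sketch only

        ToyFilter(int[] approximation, IntPredicate matches, float matchCost) {
            this.approximation = approximation;
            this.matches = matches;
            this.matchCost = matchCost;
        }

        boolean inApproximation(int doc) {
            return Arrays.binarySearch(approximation, doc) >= 0;
        }

        boolean confirm(int doc) {
            matchCalls++;
            return matches.test(doc);
        }
    }

    /** Intersects the approximations first, then confirms cheapest-first. */
    static List<Integer> conjunction(List<ToyFilter> filters) {
        List<Integer> hits = new ArrayList<>();
        if (filters.isEmpty()) {
            return hits;
        }
        // Exact checks run in ascending matchCost order.
        List<ToyFilter> byCost = new ArrayList<>(filters);
        byCost.sort(Comparator.comparingDouble((ToyFilter f) -> f.matchCost));

        // Lead with the shortest approximation to visit the fewest candidates.
        ToyFilter lead = filters.get(0);
        for (ToyFilter f : filters) {
            if (f.approximation.length < lead.approximation.length) {
                lead = f;
            }
        }

        nextDoc:
        for (int doc : lead.approximation) {
            // Phase 1: every approximation must contain the doc.
            for (ToyFilter f : filters) {
                if (!f.inApproximation(doc)) {
                    continue nextDoc;
                }
            }
            // Phase 2: exact checks, short-circuiting on the first failure.
            for (ToyFilter f : byCost) {
                if (!f.confirm(doc)) {
                    continue nextDoc;
                }
            }
            hits.add(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyFilter cheap = new ToyFilter(
            new int[] {0, 2, 4, 6, 8}, d -> d % 4 == 0, 1f);
        ToyFilter costly = new ToyFilter(
            new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, d -> d != 0, 100f);
        System.out.println(conjunction(Arrays.asList(costly, cheap)));
        System.out.println(cheap.matchCalls + " cheap checks, "
            + costly.matchCalls + " costly checks");
    }
}
```

In this toy run the costly check is only invoked on docs that already passed
the cheap one, which is the behavior the old cost-ordered DocIdSetIterator
approach could not guarantee.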
Unfortunately, the "cost" param on these non-cached filter queries loses its
meaning. Instead, the Queries themselves and any TPIs (TwoPhaseIterators) they
may have ought to supply suitable costs, which are not externally
configurable. Maybe we could make a wrapping query that overrides the
underlying TPI.matchCost... or just not bother, and let the queries compute an
internal cost that is perhaps better than whatever the user supplies. I lean
toward the latter; less complexity. Unfortunately, ValueSourceScorer's TPI
matchCost is a constant 100 instead of varying based on the particular
FunctionValues implementation. That should be its own issue to address.
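
For reference, the wrapping idea mentioned above could look roughly like the
following. This is a minimal sketch against a hypothetical ToyTwoPhase
interface, not Lucene's TwoPhaseIterator (which is an abstract class that also
carries an approximation); a real version would additionally need Query,
Weight and Scorer wrappers, which are omitted here. The names of, withCost and
userCost are assumptions made for illustration.

```java
import java.util.function.IntPredicate;

public class MatchCostOverrideSketch {

    /** Hypothetical minimal two-phase contract; Lucene's is richer. */
    interface ToyTwoPhase {
        boolean matches(int doc);
        float matchCost();
    }

    /** Builds a toy two-phase check that reports its own intrinsic cost. */
    static ToyTwoPhase of(IntPredicate check, float intrinsicCost) {
        return new ToyTwoPhase() {
            @Override
            public boolean matches(int doc) {
                return check.test(doc);
            }

            @Override
            public float matchCost() {
                return intrinsicCost;
            }
        };
    }

    /** Delegates matching unchanged but reports a user-supplied cost. */
    static ToyTwoPhase withCost(ToyTwoPhase delegate, float userCost) {
        return new ToyTwoPhase() {
            @Override
            public boolean matches(int doc) {
                return delegate.matches(doc);
            }

            @Override
            public float matchCost() {
                return userCost;
            }
        };
    }

    public static void main(String[] args) {
        ToyTwoPhase intrinsic = of(doc -> doc % 2 == 0, 100f);
        ToyTwoPhase overridden = withCost(intrinsic, 5f);
        System.out.println(intrinsic.matchCost() + " -> " + overridden.matchCost());
    }
}
```

The wrapper changes only the ordering hint, never which docs match, so the
trade-off is purely extra plumbing versus trusting each query's own estimate.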
> Use TwoPhaseIterator for non-cached filter queries
> --------------------------------------------------
>
> Key: SOLR-14166
> URL: https://issues.apache.org/jira/browse/SOLR-14166
> Project: Solr
> Issue Type: Sub-task
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> "fq" filter queries that have cache=false and which aren't processed as a
> PostFilter (thus either aren't a PostFilter or have a cost < 100) are
> processed in SolrIndexSearcher using a custom Filter thingy which uses a
> cost-ordered series of DocIdSetIterators. This is not TwoPhaseIterator
> aware, and thus the matches() method may be called on docs that ideally would
> have been filtered by lower-cost filter queries.