[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002960#comment-17002960 ]
David Smiley commented on SOLR-13890:
-------------------------------------

bq. The TPI "approximation" for DocValuesTermsQuery is the unfiltered doc-values structure for the field. As a result TPI matches() is going to be called on all documents that have any value at all for the field in question. Under a post-filter implementation, the bitset lookup is (potentially) called much less frequently, as we only lookup values for docs that have matched all the other (non-postfilter) query clauses. Does that make sense, or am I off-base David Smiley?

I don't think this characterization is accurate. If other lower-cost queries are in play, then TPI matches() won't be called for a document that those queries can exclude.

bq. So, the postfilters behavior (not cached in filter cache) provides the best solution for certain situations where the filter cache is problematic.

We can estimate that it's best not to cache; the Query could implement ExtendedQuery and sometimes return false from getCache() by default. Perhaps it should always default to false... maybe all O(docs) queries should default this way. Perhaps a better heuristic is how long the IndexSearcher has been open. Regardless, the user can and should retain the ability to be explicit if he/she chooses.

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a
> query that could be run as a postfilter. Particularly, when users are
> checking for hundreds or thousands of terms, a postfilter implementation can
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the
> {{docValuesTermsFilter}} "method". Postfilter creation can use a
> SortedSetDocValues object to populate a DV bitset with the "terms" being
> checked for. Each document run through the post-filter can look at its
> doc-values for the field in question and check them efficiently against the
> constructed bitset.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
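
For readers following the issue description, the ordinal-bitset check it proposes can be modeled roughly as below. This is only a plain-JDK sketch, not the attached patch and not Lucene or Solr code: the class name TermsOrdinalBitsetSketch, the sorted String[] standing in for the field's term dictionary, and the int[] of per-document ordinals are all assumptions made for illustration. In a real implementation the ordinals would presumably come from SortedSetDocValues rather than from arrays.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Plain-JDK model of the ordinal-bitset check; it does not use Lucene or Solr classes. */
final class TermsOrdinalBitsetSketch {

    /**
     * Marks the ordinal of every query term that exists in the field's sorted
     * term dictionary (hypothetical stand-in for a doc-values term lookup).
     * Query terms absent from the dictionary are skipped.
     */
    static BitSet buildOrdinalBits(String[] sortedDictionary, List<String> queryTerms) {
        BitSet bits = new BitSet(sortedDictionary.length);
        for (String term : queryTerms) {
            int ord = Arrays.binarySearch(sortedDictionary, term);
            if (ord >= 0) {
                bits.set(ord);
            }
        }
        return bits;
    }

    /** Returns true if any of the document's term ordinals is marked in the bitset. */
    static boolean matches(int[] docOrdinals, BitSet bits) {
        for (int ord : docOrdinals) {
            if (bits.get(ord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry", "date"};
        BitSet bits = buildOrdinalBits(dictionary, List.of("banana", "date", "zebra"));

        // Each row holds the term ordinals of one document; an empty row has no value.
        int[][] docs = { {0}, {1, 2}, {}, {3} };
        for (int doc = 0; doc < docs.length; doc++) {
            System.out.println("doc " + doc + " matches: " + matches(docs[doc], bits));
        }
    }
}
```

The bitset is built once per query, so the per-document cost is a few bit lookups regardless of how many terms were requested. How often that per-document check actually runs depends on the surrounding query execution, which is the point under discussion in the comment above.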