[ https://issues.apache.org/jira/browse/SOLR-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799220#comment-17799220 ]
Chris M. Hostetter commented on SOLR-16858: ------------------------------------------- (NOTE: I'm responding to some questions/comments out of order so that I can group the answers by logical topic) ---- {quote}I also agree with David, few people know that to run post-filtering you need ... for sure un-related with vector-based search, but still relevant ... {quote} I have conflicting opinions about the API for using PostFilters, but agree we shouldn't let that bog down this discussion of the knn parser – if one of you opens an issue regarding that please link from here and i'll try to comment there w/thoughts later. ---- {quote}In terms of the 'postFiltering bug', once we add the local param, and remove the slurping, is any bug left? {quote} I re-reviewed the code today and I do not believe there is any "postFiltering bug" on either the {{main}} branch, or the {{jira/SOLR-16858}} branch. {{KnnQParser}} 's use of {{SolrIndexSearcher.getProcessedFilter}} will ensure that any PostFilters are excluded from the "pre-filter wrapping". (We can add a test that proves it by checking the result of {{KnnVectorQuery.getFilter()}} after using {{KnnQParser}} in a request with PostFilters) I made my earlier comment about a (possible) bug in the PostFilter situation based on some confusion I had about the motivation for your PR1246. I thought _YOU_ were saying there was a bug with knn + PostFilters, which PR1246 would fix by delaying the pre-filter wrapping until after SolrIndexSearcher had computed the PostFilters (and that you had been blocked on committing PR1246 to fix the "bug" while waiting for the lucene change). Just a misunderstanding on my part. ---- {quote}I think it makes sense to have a local parameter (which we can call 'preFilter' maybe?) {quote} I'm happy to change it to " {{preFilter}} " ---- {quote}This is a breaking change, so we'll have to clarify as a warning somewhere as users will see a drastic difference (people using FQ as pre-filters will have to switch to the local parameter instead (but I think it's fair). {quote} I'm not certain if we should _EVER_ remove the fq slurping logic completely – but I DEFINITELY do not think we should do so in a way that would be a breaking change on upgrade. (if we're going to do it, it should first be deprecated, with some new param or config option to enable/disable, and wait for a major version to change the default) The reason I'm not convinced we should *EVER* completely remove the fq slurping ties into your other question... {quote}But I'm struggling to understand why we would need the include/exclude tags as local parameters for KNN. Once we set that the local preFilter is independent of the global FQs, why should we involve tags? {quote} This goes back to those other 2 usecases (or maybe "user profiles" is a better description) that I mentioned in my first comment on this jira: it makes a lot of sense for some users to have the slurping of global FQ params – as long as we give them some control over how it's done... {quote}... i've talked to some folks who think about using KNN very differently then i do, who really like the current behavior of slurping up all of the global fq params, but they really need the ability to exclude a few fq params based on tags – in the same way that multiselect faceting requires using {{excludeTags}} as you add {{fq}} params for drilldown. ... ... so that the numFound starts to decrease, as all of the facet drill down fq params are applied independently to the knn. ... Part of my idea is to not only support an {{excludeTags}} localparam, but also an {{includeTags}} param for folks who might have a lot of {{fq}} params that they use in _every_ request (ie: invariants/appends on the handler), and if/when the q param uses knn they want a large subset (but not {_}all{_}) of their fq params to be slurped up by the knn parser. {quote} The last part of your question was... {quote}Is it just to avoid re-writing a pre-filter condition that has been already specified by a tag in the global? {quote} ...and I guess that's one way of looking at it – but to me it's more about giving as much flexibility as possible for how people can specify the pre-filter. To go back to the earlier part of your question... {quote}Once we set that the local preFilter is independent of the global FQs, why should we involve tags? {quote} My thinking about this issue is that {{KnnQParser}} should support multiple ways of defining the pre-filter that exists on the resulting {{KnnVectorQuery}} : * (1) When {{KnnQParser}} is used for the main query string, FQ params are used to build the _default_ pre-filter on the {{KnnVectorQuery}} ** if you find yourself in a situation where you only want to use _some_ (but not all) of the global FQs - you can use {{includeTags}} or {{excludeTags}} localparams to adjust the set of FQ params used * (2) Regardless of where/how you use {{KnnQParser}} , a {{preFilter}} local param can be used to override the default behavior and explicit set the pre-filter on the {{KnnVectorQuery}} ...ie: there doesn't have to be one and only one way of specifying what gets put in {{KnnVectorQuery.getFilter}} – we can give the user options (and in some use cases, one option will be easier to use then others) does that make sense? ---- I'm about to go offline until January, but when I get back I'll have some time to update the jira branch to: * change the local param name to " {{preFilter}} " * add a clear-box test of the post filter situation to prove they are automatically excluded * add ref-guide edits of the new params with some examples. > Allow KnnQParser to selectively apply filters > --------------------------------------------- > > Key: SOLR-16858 > URL: https://issues.apache.org/jira/browse/SOLR-16858 > Project: Solr > Issue Type: Bug > Reporter: Joel Bernstein > Assignee: Chris M. Hostetter > Priority: Major > Labels: hybrid-search > Attachments: SOLR-16858-1.patch, SOLR-16858.patch > > Time Spent: 20m > Remaining Estimate: 0h > > The KnnQParser is parsing the filter query which limits the rows considered > by the vector query with the following method: > {code:java} > private Query getFilterQuery() throws SolrException, SyntaxError { > boolean isSubQuery = recurseCount != 0; > if (!isFilter() && !isSubQuery) { > String[] filterQueries = req.getParams().getParams(CommonParams.FQ); > if (filterQueries != null && filterQueries.length != 0) { > try { > List<Query> filters = QueryUtils.parseFilterQueries(req); > SolrIndexSearcher.ProcessedFilter processedFilter = > req.getSearcher().getProcessedFilter(filters); > return processedFilter.filter; > } catch (IOException e) { > throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e); > } > } > } > return null; > } > {code} > This is pulling all filter queries from the main query parameters and using > them to limit the vector query. This is the automatic behavior of the > KnnQParser. > There are cases where you may want to selectively apply different filters. > One such case is SOLR-16857 which involves reRanking a collapsed query. > Overriding the default filter behavior could be done by adding an "fq" local > parameter to the KnnQParser which would override the default filtering > behavior. > {code:java} > {!knn f=vector topK=10 fq=$kfq}[...]&kfq=myquery > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org