[jira] [Commented] (SOLR-16858) Allow KnnQParser to selectively apply filters

Chris M. Hostetter (Jira) Wed, 20 Dec 2023 16:55:08 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799220#comment-17799220
 ]


Chris M. Hostetter commented on SOLR-16858:
-------------------------------------------

(NOTE: I'm responding to some questions/comments out of order so that I can 
group the answers by logical topic)
----
{quote}I also agree with David, few people know that to run post-filtering you 
need ... for sure un-related with vector-based search, but still relevant ...
{quote}
I have conflicting opinions about the API for using PostFilters, but agree we 
shouldn't let that bog down this discussion of the knn parser – if one of you 
opens an issue regarding that please link from here and i'll try to comment 
there w/thoughts later.
----
{quote}In terms of the 'postFiltering bug', once we add the local param, and 
remove the slurping, is any bug left?
{quote}
I re-reviewed the code today and I do not believe there is any "postFiltering 
bug" on either the {{main}} branch, or the {{jira/SOLR-16858}} branch. 
{{KnnQParser}} 's use of {{SolrIndexSearcher.getProcessedFilter}} will ensure 
that any PostFilters are excluded from the "pre-filter wrapping".

(We can add a test that proves it by checking the result of 
{{KnnVectorQuery.getFilter()}} after using {{KnnQParser}} in a request with 
PostFilters)

I made my earlier comment about a (possible) bug in the PostFilter situation 
based on some confusion I had about the motivation for your PR1246. I thought 
_YOU_ were saying there was a bug with knn + PostFilters, which PR1246 would 
fix by delaying the pre-filter wrapping until after SolrIndexSearcher had 
computed the PostFilters (and that you had been blocked on committing PR1246 to 
fix the "bug" while waiting for the lucene change).

Just a misunderstanding on my part.
----
{quote}I think it makes sense to have a local parameter (which we can call 
'preFilter' maybe?)
{quote}
I'm happy to change it to " {{preFilter}} "
----
{quote}This is a breaking change, so we'll have to clarify as a warning 
somewhere as users will see a drastic difference (people using FQ as 
pre-filters will have to switch to the local parameter instead (but I think 
it's fair).
{quote}
I'm not certain if we should _EVER_ remove the fq slurping logic completely – 
but I DEFINITELY do not think we should do so in a way that would be a breaking 
change on upgrade. (if we're going to do it, it should first be deprecated, 
with some new param or config option to enable/disable, and wait for a major 
version to change the default)

The reason I'm not convinced we should *EVER* completely remove the fq slurping 
ties into your other question...
{quote}But I'm struggling to understand why we would need the include/exclude 
tags as local parameters for KNN.

Once we set that the local preFilter is independent of the global FQs, why 
should we involve tags?
{quote}
This goes back to those other 2 usecases (or maybe "user profiles" is a better 
description) that I mentioned in my first comment on this jira: it makes a lot 
of sense for some users to have the slurping of global FQ params – as long as 
we give them some control over how it's done...
{quote}... i've talked to some folks who think about using KNN very differently 
then i do, who really like the current behavior of slurping up all of the 
global fq params, but they really need the ability to exclude a few fq params 
based on tags – in the same way that multiselect faceting requires using 
{{excludeTags}} as you add {{fq}} params for drilldown. ...

... so that the numFound starts to decrease, as all of the facet drill down fq 
params are applied independently to the knn.

...

Part of my idea is to not only support an {{excludeTags}} localparam, but also 
an {{includeTags}} param for folks who might have a lot of {{fq}} params that 
they use in _every_ request (ie: invariants/appends on the handler), and 
if/when the q param uses knn they want a large subset (but not {_}all{_}) of 
their fq params to be slurped up by the knn parser.
{quote}
The last part of your question was...
{quote}Is it just to avoid re-writing a pre-filter condition that has been 
already specified by a tag in the global?
{quote}
...and I guess that's one way of looking at it – but to me it's more about 
giving as much flexibility as possible for how people can specify the 
pre-filter.

To go back to the earlier part of your question...
{quote}Once we set that the local preFilter is independent of the global FQs, 
why should we involve tags?
{quote}
My thinking about this issue is that {{KnnQParser}} should support multiple 
ways of defining the pre-filter that exists on the resulting {{KnnVectorQuery}} 
:
 * (1) When {{KnnQParser}} is used for the main query string, FQ params are 
used to build the _default_ pre-filter on the {{KnnVectorQuery}}
 ** if you find yourself in a situation where you only want to use _some_ (but 
not all) of the global FQs - you can use {{includeTags}} or {{excludeTags}} 
localparams to adjust the set of FQ params used
 * (2) Regardless of where/how you use {{KnnQParser}} , a {{preFilter}} local 
param can be used to override the default behavior and explicit set the 
pre-filter on the {{KnnVectorQuery}}

...ie: there doesn't have to be one and only one way of specifying what gets 
put in {{KnnVectorQuery.getFilter}} – we can give the user options (and in some 
use cases, one option will be easier to use then others)

does that make sense?
----
I'm about to go offline until January, but when I get back I'll have some time 
to update the jira branch to:
 * change the local param name to " {{preFilter}} "
 * add a clear-box test of the post filter situation to prove they are 
automatically excluded
 * add ref-guide edits of the new params with some examples.

> Allow KnnQParser to selectively apply filters
> ---------------------------------------------
>
>                 Key: SOLR-16858
>                 URL: https://issues.apache.org/jira/browse/SOLR-16858
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Chris M. Hostetter
>            Priority: Major
>              Labels: hybrid-search
>         Attachments: SOLR-16858-1.patch, SOLR-16858.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The KnnQParser is parsing the filter query which limits the rows considered 
> by the vector query with the following method:
> {code:java}
> private Query getFilterQuery() throws SolrException, SyntaxError {
>     boolean isSubQuery = recurseCount != 0;
>     if (!isFilter() && !isSubQuery) {
>       String[] filterQueries = req.getParams().getParams(CommonParams.FQ);
>       if (filterQueries != null && filterQueries.length != 0) {
>         try {
>           List<Query> filters = QueryUtils.parseFilterQueries(req);
>           SolrIndexSearcher.ProcessedFilter processedFilter =
>               req.getSearcher().getProcessedFilter(filters);
>           return processedFilter.filter;
>         } catch (IOException e) {
>           throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, e);
>         }
>       }
>     }
>     return null;
>   }
> {code}
> This is pulling all filter queries from the main query parameters and using 
> them to limit the vector query. This is the automatic behavior of the 
> KnnQParser.
> There are cases where you may want to selectively apply different filters. 
> One such case is SOLR-16857 which involves reRanking a collapsed query.
> Overriding the default filter behavior could be done by adding an "fq" local 
> parameter to the KnnQParser which would override the default filtering 
> behavior.
> {code:java}
> {!knn f=vector topK=10 fq=$kfq}[...]&kfq=myquery
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[jira] [Commented] (SOLR-16858) Allow KnnQParser to selectively apply filters

Reply via email to