[
https://issues.apache.org/jira/browse/SOLR-17736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris M. Hostetter updated SOLR-17736:
--------------------------------------
Attachment: SOLR-17736.patch
Status: Open (was: Open)
I haven't fully wrapped my head around everything in the linked GH PR#3316, but
IIUC the "meat" of the idea is that:
* IF:
** a block join {{parent}} QParser is used
** AND that {{parent}} QParser directly wraps a (single) {{KnnXxxVectorQuery}}
* THEN:
** Extract some of the properties from the {{KnnXxxVectorQuery}} (field name,
vector, topK, etc...)
** Ignore the original KNN query and replace it with a new
{{DiversifyingChildrenXxxKnnVectorQuery}} query
(correct?)
To me, this approach seems kind of "kludgy" and brittle.
In my day job, we have a small plugin that we use with Solr which creates
instances of {{DiversifyingChildren...}} queries via a simple subclass of the
existing {{KnnQParserPlugin}} using a new {{childOf}} local param (modeled
after the {{of}} param in Solr's {{child}} QParser).
This approach supports several use cases that (AFAICT) the current PR does not:
* Create {{DiversifyingChildrenXxxKnnVectorQuery}} instances even w/o a
{{parent}} wrapper
** When you want to return diverse child docs w/o joining to the parent doc
* Wrap {{parent}} queries around BooleanQuery containing multiple clauses (one
or more of which might be {{DiversifyingChildrenXxxKnnVectorQuery}})
** When you want the top scoring parents based on child scores using multiple
vector queries, either because of multiple input vectors, or because of
multiple vector fields.
** Or when you want to return parent docs based on *either* diverse topK
children *OR* some other non-vector child criteria
* Wrap {{parent}} queries around plain {{KnnXxxVectorQuery}} against children,
w/o using {{DiversifyingChildren...}}
** When you don't care about the overhead of diversification, perhaps because
you know each parent has at most one child (of a particular type) with a vector
I'm attaching a patch that adapts my current custom plugin to re-implement it
as a simple addition to the existing {{KnnQParserPlugin}} that kicks in if
and only if a {{childOf}} local param is specified.
To my mind this approach is a lot cleaner and more versatile than the
"{{parent}} QParser wrapped around {{knn}} QParser should always throw away the
original query and build its own" approach in the PR, but if folks prefer the
approach in PR#3316 then can we please at least make it configurable? (maybe
via a new variant of the {{parent}} QParser?)
Because as it stands right now, since
{{DiversifyingChildrenXxxKnnVectorQuery}} extends {{KnnXxxVectorQuery}}, it
will become impossible to support some of the use cases I listed above (even
with a custom plugin) because the Solr {{parent}} QParser will start treating
_any_ {{KnnXxxVectorQuery}} it wraps (including a
{{DiversifyingChildrenXxxKnnVectorQuery}} created by a custom plugin) as
special, throwing it away and creating its own.
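To make that last concern concrete, here is a minimal Java sketch. The classes are stand-ins invented for illustration (they are not the Lucene or Solr classes), and the {{instanceof}} check is only an assumption about how a {{parent}} QParser might detect the query it wraps:

```java
// Stand-in classes only: they mimic the inheritance relationship described
// above and are NOT the real Lucene or Solr classes.
class ParentWrapSketch {

    // Plays the role of a plain KNN vector query.
    static class KnnQueryStandIn {
        final String field;
        final int topK;

        KnnQueryStandIn(String field, int topK) {
            this.field = field;
            this.topK = topK;
        }
    }

    // Plays the role of a diversifying query, which extends the plain KNN query.
    static class DiversifyingQueryStandIn extends KnnQueryStandIn {
        final String parentsFilter;

        DiversifyingQueryStandIn(String field, int topK, String parentsFilter) {
            super(field, topK);
            this.parentsFilter = parentsFilter;
        }
    }

    // Hypothetical rewrite step: any wrapped KNN query is discarded and rebuilt.
    static Object rewriteWrapped(Object wrapped, String allParents) {
        if (wrapped instanceof KnnQueryStandIn) {
            KnnQueryStandIn knn = (KnnQueryStandIn) wrapped;
            return new DiversifyingQueryStandIn(knn.field, knn.topK, allParents);
        }
        return wrapped; // non-KNN queries pass through untouched
    }

    public static void main(String[] args) {
        // A query that a custom plugin built on purpose, with its own parent filter.
        DiversifyingQueryStandIn custom =
            new DiversifyingQueryStandIn("vector", 10, "type:book");

        Object result = rewriteWrapped(custom, "type:any_parent");

        // The instanceof check also matches the subclass, so the custom
        // instance is replaced and its own parent filter is lost.
        System.out.println("same instance kept: " + (result == custom));
        System.out.println("parents filter now: "
            + ((DiversifyingQueryStandIn) result).parentsFilter);
    }
}
```

In this sketch the only ways to let the custom instance through would be an exact class comparison instead of {{instanceof}}, or an explicit opt-out, which is why a configurable variant seems worth asking for.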
> Introduce support for KNN search on nested vector documents
> -----------------------------------------------------------
>
> Key: SOLR-17736
> URL: https://issues.apache.org/jira/browse/SOLR-17736
> Project: Solr
> Issue Type: New Feature
> Components: query
> Affects Versions: 9.8
> Reporter: Alessandro Benedetti
> Priority: Major
> Labels: pull-request-available
> Attachments: SOLR-17736.patch
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> This issue tracks the work of introducing the support for KNN search on
> nested vector documents, surfacing the Lucene implementation in here:
> https://github.com/apache/lucene/pull/12434
> This allows both:
> - KNN retrieval of children, applying parent filters with no denormalisation
> needed
> - KNN retrieval of parents (based on children KNN, children level prefiltering
> and parent level prefiltering)
> It's one way of having multi-valued vectors per field, per document in Solr.
> More will come soon