Hi all,

I have a collection with about 2.5 million documents. I've been
experimenting with the KNN dense vector search (Solr's approximate
nearest neighbour search over embeddings) that's available in Solr 9.0.
It works really well: it's very fast (as long as the requested number of
nearest results, k, is not too big).

The only thing is that the KNN search doesn't seem to combine with any
other query parameters. If I add another query clause (like
"is_enabled:true" or whatever), the KNN search behaves like a filter
query: *both* the KNN search AND the traditional search are run, the
results are intersected, and the intersection is returned. I think this
is what the docs say about knn as a filter query:
https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html
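
For reference, here is a minimal sketch of how the kind of request I'm
describing is put together. The field name "vector" and the helper
build_knn_params are made up for illustration; only the {!knn f=... topK=...}
syntax comes from the docs linked above. It just builds the parameters and
does not talk to Solr.

```python
from typing import Dict, List, Sequence, Union

Params = Dict[str, Union[str, List[str]]]


def build_knn_params(
    vector: Sequence[float],
    top_k: int,
    field: str = "vector",  # hypothetical dense vector field name
    filters: Sequence[str] = (),
) -> Params:
    """Build Solr request parameters for a knn query plus optional fq clauses."""
    # The knn query parser takes the query vector as a bracketed list of floats.
    vec = "[" + ", ".join(repr(float(x)) for x in vector) + "]"
    params: Params = {"q": f"{{!knn f={field} topK={top_k}}}{vec}"}
    if filters:
        # Each refining clause (e.g. "is_enabled:true") goes in its own fq.
        params["fq"] = list(filters)
    return params


if __name__ == "__main__":
    print(build_knn_params([1, 0.5], 10, filters=["is_enabled:true"]))
```

The resulting dict could be passed as query parameters to the collection's
/select handler with whatever HTTP client you use.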

This is a bit of a problem because I want to be able to combine "nearest
neighbour" with other refining clauses ("is_enabled:true" or "color:red"
would be examples).

Is there any way to mix the queries? Or, if not right now, do you think
it'll be coming in later versions of Solr?

BTW I have also experimented with taking every float value from the
embedding vector and putting each one into its own field (and I have 512
floats in the embedding!). Then I can use the dist() function for sorting
(so now it's an exact "nearest neighbour" rather than an "approximate
nearest neighbour"). This works 100%, but if I query all 2.5M documents
it's too slow. If I apply a query that gets me down to < 50K documents it
works fine, so this is a usable solution in certain situations.
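
A rough sketch of how such a sort clause can be generated, assuming the
per-component fields are named emb_0, emb_1, ... (a hypothetical naming
scheme, as is the helper build_dist_sort). It relies on the dist() function
query taking a power followed by the components of the two points, with
power 2 giving Euclidean distance.

```python
from typing import Sequence


def build_dist_sort(
    query_vector: Sequence[float],
    field_prefix: str = "emb_",  # hypothetical per-component field prefix
    power: int = 2,              # 2 = Euclidean distance
) -> str:
    """Build a sort clause ordering documents by distance to query_vector."""
    # First point: one field per vector component (emb_0, emb_1, ...).
    fields = [f"{field_prefix}{i}" for i in range(len(query_vector))]
    # Second point: the literal components of the query vector.
    values = [repr(float(x)) for x in query_vector]
    return f"dist({power},{','.join(fields + values)}) asc"


if __name__ == "__main__":
    print(build_dist_sort([0.1, 0.2]))
```

The returned string would go in the sort parameter, alongside an ordinary q
or fq that narrows the candidate set first.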

Thanks for any help or info on this!

all the best,

Derek



