Hi all, I have a collection with about 2.5 million documents. I've been experimenting with the KNN dense vector search (Solr's embedding-based approximate nearest neighbour search) that's available in Solr 9.0. It works really well - it's very fast (as long as the requested number of nearest neighbours, k, is not too large).
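
For reference, a KNN query of this kind might be built roughly as follows. This is only a sketch: the field name `vector`, the `topK` value, the three-element example vector and the helper `build_knn_params` are illustrative assumptions, not taken from the actual collection. Only the `{!knn f=... topK=...}[...]` query parser syntax comes from the Solr dense vector search documentation.

```python
def build_knn_params(query_vector, field="vector", top_k=10):
    """Build Solr request parameters for a {!knn} dense vector query.

    `field` is a hypothetical DenseVectorField name; adjust it to the schema.
    """
    if not query_vector:
        raise ValueError("query_vector must not be empty")
    # Solr expects the vector as a bracketed, comma-separated list of floats.
    vector_literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return {
        "q": "{!knn f=%s topK=%d}%s" % (field, top_k, vector_literal),
        "rows": top_k,  # ask for as many rows as neighbours requested
    }


# A real embedding would have 512 floats; three are used here for brevity.
params = build_knn_params([0.1, 0.25, 0.5], top_k=5)
print(params["q"])
```

The resulting dictionary can be passed as request parameters to the collection's `/select` handler with any HTTP client.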
The only issue is that the KNN search seems to be entirely independent of any other query parameters. If I add another query clause (like "is_enabled:true" or whatever), the KNN search behaves like just another filter query: *both* the KNN search AND the traditional search are run, the two result sets are intersected, and that intersection is returned (I think this is what the docs say about knn as a filter query: https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html ). This is a bit of a problem because I want to be able to combine "nearest neighbour" with other refining fields ("is_enabled:true" or "color:red" would be examples). Is there any way to mix the queries? Or, if not right now, do you think it will be coming in later versions of Solr?

BTW, I have also experimented with taking every float value from the embedding vector and putting each one into its own field (and I have 512 floats in the embedding!). Then I can use the dist() function for sorting (so now it's an exact "nearest neighbour" rather than an "approximate nearest neighbour"). This works 100%, but if I query all 2.5M documents it's too slow. If I apply a query that gets me down to < 50K documents it works fine, so this is a usable solution in certain situations.

Thanks for any help or info on this!

All the best,
Derek
