rmuir commented on code in PR #11946:
URL: https://github.com/apache/lucene/pull/11946#discussion_r1051525868


##########
lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java:
##########
@@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) {
    * @throws IllegalArgumentException if <code>k</code> is less than 1
    */
   public KnnVectorQuery(String field, float[] target, int k, Query filter) {
+    this(field, target, k, Float.NEGATIVE_INFINITY, filter);
+  }
+
+  /**
+   * Find the <code>k</code> nearest documents to the target vector according to the vectors in the
+   * given field. <code>target</code> vector.
+   *
+   * @param field a field that has been indexed as a {@link KnnVectorField}.
+   * @param target the target of the search
+   * @param k the number of documents to find (the upper bound)
+   * @param similarityThreshold the minimum acceptable value of similarity

Review Comment:
   We still don't have any explanation here as to why we'd do this for a vector search query.
   
   We have avoided any such thresholds or normalization in all of Lucene's scoring for decades: if we hadn't, we would never have been able to implement block-max WAND or other algorithms, because they'd be incompatible.
   
   Please see:
   * https://cwiki.apache.org/confluence/display/LUCENE/LuceneFAQ#LuceneFAQ-CanIfilterbyscore?
   * https://cwiki.apache.org/confluence/display/LUCENE/ScoresAsPercentages
   
   I don't mind being the bad guy blocking this change, because it seems it has not been thought through. You must convince me.
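
For readers following the thread, here is a minimal sketch of the semantics the Javadoc in the diff above appears to describe: `k` becomes an upper bound, and `similarityThreshold` drops hits below a minimum score. It deliberately uses no Lucene classes. `ThresholdedTopK`, `topK`, and the `scores` array are hypothetical names made up for illustration, and this is not how the PR implements the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene or of this PR.
final class ThresholdedTopK {

  /**
   * Returns at most k doc ids (indices into scores), best score first,
   * keeping only docs whose score is >= similarityThreshold.
   */
  static List<Integer> topK(float[] scores, int k, float similarityThreshold) {
    if (k < 1) {
      throw new IllegalArgumentException("k must be at least 1, got: " + k);
    }
    List<Integer> docs = new ArrayList<>();
    for (int doc = 0; doc < scores.length; doc++) {
      // With Float.NEGATIVE_INFINITY every non-NaN score passes this check.
      if (scores[doc] >= similarityThreshold) {
        docs.add(doc);
      }
    }
    // Highest score first; ties are broken by the lower doc id.
    docs.sort((a, b) -> {
      int byScore = Float.compare(scores[b], scores[a]);
      return byScore != 0 ? byScore : Integer.compare(a, b);
    });
    return new ArrayList<>(docs.subList(0, Math.min(k, docs.size())));
  }

  /** Mirrors the delegation in the diff: "no threshold" is Float.NEGATIVE_INFINITY. */
  static List<Integer> topK(float[] scores, int k) {
    return topK(scores, k, Float.NEGATIVE_INFINITY);
  }
}
```

With a threshold set, the caller can receive fewer than `k` hits, which is presumably why the new Javadoc calls `k` "the upper bound". The cut-off is an absolute score value, which is the kind of score filtering the two wiki pages linked in the comment discuss.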