mikemccand commented on a change in pull request #2022:
URL: https://github.com/apache/lucene-solr/pull/2022#discussion_r518797861



##########
File path: lucene/core/src/java/org/apache/lucene/index/VectorValues.java
##########
@@ -74,6 +74,18 @@ public BytesRef binaryValue() throws IOException {
     throw new UnsupportedOperationException();
   }
 
+  /**
+   * Return the k nearest neighbor documents as determined by comparison of their vector values
+   * for this field, to the given vector, by the field's search strategy. If the search strategy is
+   * reversed, lower values indicate nearer vectors, otherwise higher scores indicate nearer
+   * vectors. Unlike relevance scores, vector scores may be negative.
+   * @param target the vector-valued query
+   * @param k      the number of docs to return
+   * @param fanout control the accuracy/speed tradeoff - larger values give better recall at higher cost

Review comment:
   > Yeah, I think there needs to be a follow-on exposing the index-time
   > controls, which indeed are much more potent than this search-time fanout,
   > which has only a small impact on recall and latency. In this patch they
   > are globals in HnswGraphBuilder, but there is no API for setting them.
   
   OK, makes sense.
   
   > I am thinking the index-time hyperparameters would be specified in
   > IndexWriterConfig?
   
   Hmm, maybe these could be codec-level controls? Or maybe `FieldInfo`? They
   would be per-vector-field configuration, right?
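
   To make the per-field idea concrete, here is a minimal sketch, assuming the index-time hyperparameters boil down to two integers. `VectorFieldConfig`, `HnswParams`, `maxConn` and `beamWidth` are hypothetical names chosen for illustration, not existing Lucene API, and the sketch does not decide whether this would live in the codec, in `FieldInfo`, or in `IndexWriterConfig`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for per-field HNSW index-time settings; not a Lucene API. */
final class VectorFieldConfig {

  /** Hypothetical pair of index-time hyperparameters (names are assumptions). */
  static final class HnswParams {
    final int maxConn;   // assumed: max neighbors kept per graph node
    final int beamWidth; // assumed: candidates explored while inserting a node

    HnswParams(int maxConn, int beamWidth) {
      if (maxConn <= 0 || beamWidth <= 0) {
        throw new IllegalArgumentException("maxConn and beamWidth must be positive");
      }
      this.maxConn = maxConn;
      this.beamWidth = beamWidth;
    }
  }

  private final HnswParams defaults;
  private final Map<String, HnswParams> perField = new HashMap<>();

  VectorFieldConfig(HnswParams defaults) {
    this.defaults = defaults;
  }

  /** Override the settings for one vector field; returns this for chaining. */
  VectorFieldConfig set(String field, HnswParams params) {
    perField.put(field, params);
    return this;
  }

  /** Settings for the given field, falling back to the defaults. */
  HnswParams forField(String field) {
    return perField.getOrDefault(field, defaults);
  }
}
```

   With something of this shape, the graph builder would look up `forField(name)` when it starts building a field's graph instead of reading globals, and fields without an override would keep today's behavior.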



