kevindrosendahl commented on code in PR #12780:
URL: https://github.com/apache/lucene/pull/12780#discussion_r1386992189
##########
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java:
##########
@@ -81,7 +81,8 @@ public void init() {
protected void addRandomFields(Document doc) {
switch (vectorEncoding) {
case BYTE -> doc.add(new KnnByteVectorField("v2", randomVector8(30), similarityFunction));
- case FLOAT32 -> doc.add(new KnnFloatVectorField("v2", randomVector(30), similarityFunction));
+ case FLOAT32 -> doc.add(
+ new KnnFloatVectorField("v2", randomNormalizedVector(30), similarityFunction));
Review Comment:
I couldn't reproduce the problem in `testQuantizedVectorsWriteAndRead()` because
`BaseKnnVectorsFormatTestCase::randomVector()` [always normalizes the vectors it creates](https://github.com/apache/lucene/blob/20d5de448a739beb85e380d153fb13bf1817a8d6/lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java#L1236).
That means the persisted vectors matched the expected vectors (even after
normalizing the expected vectors when using cosine).
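A minimal sketch of why already-normalized inputs can hide this kind of discrepancy: L2-normalizing a unit-length vector leaves it essentially unchanged, so a test that only ever feeds unit vectors cannot tell whether normalization was applied. The class name, the helper names, and the hand-rolled normalization are all hypothetical stand-ins, not Lucene code.

```java
import java.util.Random;

final class NormalizationMaskingSketch {
  // Hand-rolled L2 normalization that returns a new array.
  // This is a stand-in for illustration, not Lucene's implementation.
  static float[] l2normalize(float[] v) {
    double squareSum = 0.0;
    for (float x : v) {
      squareSum += (double) x * x;
    }
    double norm = Math.sqrt(squareSum);
    float[] out = new float[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = (float) (v[i] / norm);
    }
    return out;
  }

  // Largest per-component absolute difference between two equal-length vectors.
  static float maxAbsDiff(float[] a, float[] b) {
    float max = 0f;
    for (int i = 0; i < a.length; i++) {
      max = Math.max(max, Math.abs(a[i] - b[i]));
    }
    return max;
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    float[] raw = new float[30];
    for (int i = 0; i < raw.length; i++) {
      // The 0.5 offset keeps every component, and so the norm, well away from zero.
      raw[i] = random.nextFloat() + 0.5f;
    }
    float[] unit = l2normalize(raw);

    // Normalizing a unit vector again is (nearly) a no-op, so the difference is tiny.
    System.out.println("unit vs normalized unit: " + maxAbsDiff(unit, l2normalize(unit)));
    // Normalizing the raw vector changes it visibly.
    System.out.println("raw vs normalized raw:   " + maxAbsDiff(raw, l2normalize(raw)));
  }
}
```

With unit-length inputs the first difference stays at floating-point noise, which is why a test needs non-normalized vectors to observe whether normalization took place.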
To confine the switch from normalized to potentially non-normalized vectors to
just this test, I kept the existing behavior (i.e. existing tests still use
normalized vectors) by renaming the existing method from `randomVector()` to
`randomNormalizedVector()`, then adding a new `randomVector()` method that does
not normalize the vectors. The changes are in this commit:
https://github.com/apache/lucene/pull/12780/commits/7cb2375e850f963630788028af34e3275fc5af09
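The split described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: it takes an explicit `java.util.Random` instead of the test framework's random source, normalizes by hand, and the retry on an all-zero vector is this sketch's own choice. The class name and the `l2norm` helper are hypothetical.

```java
import java.util.Random;

final class RandomVectorSketch {
  // New behavior: random components with no normalization applied.
  static float[] randomVector(int dim, Random random) {
    float[] v = new float[dim];
    double squareSum = 0.0;
    // Retry in the unlikely case of an all-zero vector, which could not be normalized later.
    while (squareSum == 0.0) {
      for (int i = 0; i < dim; i++) {
        v[i] = random.nextFloat();
        squareSum += (double) v[i] * v[i];
      }
    }
    return v;
  }

  // Previous behavior under its new name: a random vector scaled to unit L2 length.
  static float[] randomNormalizedVector(int dim, Random random) {
    float[] v = randomVector(dim, random);
    float norm = (float) l2norm(v);
    for (int i = 0; i < dim; i++) {
      v[i] /= norm;
    }
    return v;
  }

  // Euclidean length of a vector.
  static double l2norm(float[] v) {
    double squareSum = 0.0;
    for (float x : v) {
      squareSum += (double) x * x;
    }
    return Math.sqrt(squareSum);
  }

  public static void main(String[] args) {
    Random random = new Random(7);
    System.out.println("raw norm:        " + l2norm(randomVector(30, random)));
    System.out.println("normalized norm: " + l2norm(randomNormalizedVector(30, random)));
  }
}
```

Existing call sites that move to `randomNormalizedVector(...)` keep receiving unit-length vectors, while a test that needs non-normalized input can call `randomVector(...)` directly.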
I'm not sure that fully answers your question; I didn't try changing any
other tests to use non-normalized vectors. I'm happy to structure these
tests however makes sense to you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]