kaivalnp commented on code in PR #16092:
URL: https://github.com/apache/lucene/pull/16092#discussion_r3351355507
##########
lucene/core/src/java/org/apache/lucene/search/KnnFloatVectorQuery.java:
##########
@@ -95,6 +98,26 @@ public KnnFloatVectorQuery(
String field, float[] target, int k, Query filter, KnnSearchStrategy
searchStrategy) {
super(field, k, filter, searchStrategy);
this.target = VectorUtil.checkFinite(Objects.requireNonNull(target,
"target"));
+ this.targetPreRotated = false;
+ }
+
+ private boolean targetPreRotated;
+
+ @Override
+ public Query rewrite(IndexSearcher indexSearcher) throws IOException {
+ if (targetPreRotated == false) {
+ FieldInfo fi =
+
FieldInfos.getMergedFieldInfos(indexSearcher.getIndexReader()).fieldInfo(field);
+ if (fi != null
+ && fi.getVectorDimension() == target.length
+ &&
"true".equals(fi.getAttribute(RotationAwareKnnVectorsFormat.ROTATION_ENABLED_KEY)))
{
+ float[] rotated = new float[target.length];
+ HadamardRotation.forDimension(fi.getVectorDimension()).rotate(target,
rotated);
Review Comment:
I'm not a fan of this pattern -- the responsibility of processing the query
vector should lie with the format.
This leads to a possibility of vector queries producing garbage results if
they do not consume this attribute correctly (for example,
[`FloatVectorSimilarityQuery`](https://github.com/apache/lucene/blob/6ce7f9cfe2006bba4271b92078ef084a84d86d12/lucene/core/src/java/org/apache/lucene/search/FloatVectorSimilarityQuery.java#L36)
was missed in this PR).
Perhaps one way to catch such cases is to randomly wrap the KNN format being
tested with a rotating one in
[`BaseKnnVectorsFormatTestCase`](https://github.com/apache/lucene/blob/6ce7f9cfe2006bba4271b92078ef084a84d86d12/lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java#L138-L140)
(or if we can accept slower tests: explicitly test everything with and without
rotation).
That said, readers / writers in
[`KnnVectorsFormat`](https://github.com/apache/lucene/blob/6ce7f9cfe2006bba4271b92078ef084a84d86d12/lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsFormat.java#L36)
are segment-specific and not suitable to process rotation here, perhaps we
need a global pre-processing step in the format itself?
I haven't thought too deeply about what the API would look like though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]