mikemccand commented on issue #13867:
URL: https://github.com/apache/lucene/issues/13867#issuecomment-2400880851
OK I checked out git tag `releases/lucene/9.11.1` and made this small diff:
```
--- a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
+++ b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
@@ -117,7 +116,7 @@ public class TestInt8HnswBackwardsCompatibility extends BackwardsCompatibilityTe
     IndexWriterConfig conf =
         new IndexWriterConfig(new MockAnalyzer(random()))
             .setMaxBufferedDocs(10)
-            .setCodec(TestUtil.getDefaultCodec())
+            .setCodec(getCodec())
             .setMergePolicy(NoMergePolicy.INSTANCE);
     try (IndexWriter writer = new IndexWriter(dir, conf)) {
       for (int i = 0; i < DOC_COUNT; i++) {
```
I think that explains why the bwc indices never actually tested `int8` (nor
`int7`) quantization: the index was written with the default codec rather than
the test's own `getCodec()`, so the bwc tests did not fail with my original PR.
It makes me wonder which other bwc indices are not testing what we think they
are testing because they, too, were written with the default codec ...
This is once again the dreaded "who tests the tester?" problem. Turtles all
the way down ...
With the above diff, I then ran this command (still in 9.11.1 clone) to
regenerate all bwc indices:
```
./gradlew test -Ptests.bwcdir=/l/9111/tmp -Ptests.useSecurityManager=false \
  --tests TestGenerateBwcIndices -Dtests.verbose=true --max-workers=1
```
Then, I copied the newly generated `int8_hnsw.9.11.1.zip` into my `9.12.x`
clone's
`./lucene/backward-codecs/src/test/org/apache/lucene/backward_index/int8_hnsw.9.11.1.zip`
and re-ran `./gradlew test --tests TestInt8HnswBackwardsCompatibility` and now
it fails (phew!) with:
```
> java.lang.IllegalStateException: Quantized vector data length 70 not matching size=10 * (dim=3 + 4) = 60
>   at [email protected]/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.validateFieldEntry(Lucene99ScalarQuantizedVectorsReader.java:149)
>   at [email protected]/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.readFields(Lucene99ScalarQuantizedVectorsReader.java:121)
>   at [email protected]/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.<init>(Lucene99ScalarQuantizedVectorsReader.java:90)
>   at [email protected]/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsFormat.fieldsReader(Lucene99ScalarQuantizedVectorsFormat.java:160)
>   at [email protected]/org.apache.lucene.codecs.lucene99.Lucene99HnswScalarQuantizedVectorsFormat.fieldsReader(Lucene99HnswScalarQuantizedVectorsFormat.java:155)
>   at [email protected]/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.<init>(PerFieldKnnVectorsFormat.java:222)
>   at [email protected]/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat.fieldsReader(PerFieldKnnVectorsFormat.jav\
...
```
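The numbers in that message are worth a second look: `10 * (3 + 4)` is 70, which equals the reported data length, yet the reader arrived at 60. One reading that fits the arithmetic (an assumption, inferred only from the message and not checked against the reader's source) is that the reader sized each vector as if two 4-bit components were packed per byte, i.e. `((3 + 1) >> 1) + 4 = 6` bytes per vector. A minimal sketch of that arithmetic follows; `QuantizedSizeCheck`, `bytesPerVector`, `expectedLength`, and `packedInt4` are hypothetical names, not Lucene API:

```java
public class QuantizedSizeCheck {

  /**
   * Bytes needed for one quantized vector: the quantized components plus one
   * 4-byte float correction term (the "+ 4" in the exception message).
   * ASSUMPTION: packedInt4 models a layout with two components per byte.
   */
  public static long bytesPerVector(int dim, boolean packedInt4) {
    // Packed layout rounds up to whole bytes; otherwise one byte per component.
    int componentBytes = packedInt4 ? (dim + 1) >> 1 : dim;
    return componentBytes + Float.BYTES;
  }

  /** Total expected length of the quantized vector data for `size` vectors. */
  public static long expectedLength(int size, int dim, boolean packedInt4) {
    return Math.multiplyExact(bytesPerVector(dim, packedInt4), (long) size);
  }

  public static void main(String[] args) {
    int size = 10; // size=10 from the exception message
    int dim = 3;   // dim=3 from the exception message
    long onDisk = 70; // "Quantized vector data length 70"

    long oneBytePerDim = expectedLength(size, dim, false);
    long packed = expectedLength(size, dim, true);

    System.out.println("one byte per component: " + oneBytePerDim);
    System.out.println("packed 4-bit:           " + packed);
    System.out.println("on-disk matches one-byte layout: " + (oneBytePerDim == onDisk));
    System.out.println("on-disk matches packed layout:   " + (packed == onDisk));
  }
}
```

If that reading is right, the regenerated index stores one byte per component, while the reader in the `9.12.x` clone expected the packed layout for the same field.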