jegentile opened a new pull request, #16200: URL: https://github.com/apache/lucene/pull/16200
## Summary

- **Precompute cosine query norm**: Byte vector cosine similarity previously recomputed the query vector's squared norm on every `score()` call. Since the query is fixed for the lifetime of a scorer, the norm is now computed once and passed through a new `VectorUtil.cosine(byte[], int, byte[])` overload. This eliminates one SIMD accumulator, one SIMD multiply, and one `reduceLanes` per call in the vectorized (Panama) 256/512-bit paths.
- **Keep int4 query compressed**: `CompressedInt4DotProduct` always decompressed the query vector and used `int4DotProductSinglePacked`. The query is now kept in packed form and scored with `int4DotProductBothPacked`, halving query-side memory bandwidth.

## Test plan

- [x] `TestVectorUtil`: new tests for `cosine(byte[], int, byte[])` verifying exact float equality with `cosine(byte[], byte[])` for random vectors, plus int4 pack-and-dot-product equivalence
- [x] `TestVectorUtilSupport`: cross-provider consistency tests for the new cosine overload across all vector sizes
- [x] `TestFlatVectorScorer`: existing parameterized tests cover COSINE for byte vectors
- [x] `TestVectorScorer`: existing comparison tests between default and Panama implementations
- [x] `TestLucene99ScalarQuantizedVectorScorer`: existing `testScoringCompressedInt4()` covers the packed int4 path
- [x] `TestPrefetchableFlatVectorScorer`: no regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)
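
For readers who want the two ideas from the summary in concrete form, here is a minimal scalar sketch in plain Java. It is not the code from this PR and does not use Lucene's `VectorUtil`. The class name, the method names, and the nibble layout (first half of the vector in the high nibbles, second half in the low nibbles) are all assumptions made for illustration. The first pair of methods shows that passing in a precomputed query squared norm removes one accumulator from the per-document loop. The second pair shows a dot product taken directly over two packed int4 vectors without unpacking either side.

```java
// Illustrative sketch only: hypothetical names, not Lucene's API.
final class ByteVectorScoringSketch {

  /** Sum of squares of a byte vector. For a fixed query this is needed only once. */
  static int squaredNorm(byte[] v) {
    int sum = 0;
    for (byte b : v) {
      sum += b * b;
    }
    return sum;
  }

  /** Baseline: accumulates the dot product and both squared norms on every call. */
  static float cosine(byte[] query, byte[] doc) {
    int dot = 0;
    int queryNorm = 0;
    int docNorm = 0;
    for (int i = 0; i < query.length; i++) {
      dot += query[i] * doc[i];
      queryNorm += query[i] * query[i];
      docNorm += doc[i] * doc[i];
    }
    return (float) (dot / Math.sqrt((double) queryNorm * docNorm));
  }

  /** Variant: the caller supplies the query's squared norm, so the loop has one fewer accumulator. */
  static float cosine(byte[] query, int querySquaredNorm, byte[] doc) {
    int dot = 0;
    int docNorm = 0;
    for (int i = 0; i < query.length; i++) {
      dot += query[i] * doc[i];
      docNorm += doc[i] * doc[i];
    }
    return (float) (dot / Math.sqrt((double) querySquaredNorm * docNorm));
  }

  /**
   * Packs an even-length vector of values in [0, 15] into half as many bytes.
   * Assumed layout: element i goes in the high nibble, element i + length/2 in the low nibble.
   */
  static byte[] packInt4(byte[] raw) {
    byte[] packed = new byte[raw.length / 2];
    for (int i = 0; i < packed.length; i++) {
      packed[i] = (byte) ((raw[i] << 4) | raw[packed.length + i]);
    }
    return packed;
  }

  /** Reference dot product over unpacked int4 values. */
  static int dotProductUnpacked(byte[] a, byte[] b) {
    int total = 0;
    for (int i = 0; i < a.length; i++) {
      total += a[i] * b[i];
    }
    return total;
  }

  /** Dot product over two vectors that are both in the packed layout above. */
  static int dotProductBothPacked(byte[] a, byte[] b) {
    int total = 0;
    for (int i = 0; i < a.length; i++) {
      int aHigh = (a[i] & 0xFF) >> 4;
      int aLow = a[i] & 0x0F;
      int bHigh = (b[i] & 0xFF) >> 4;
      int bLow = b[i] & 0x0F;
      total += aHigh * bHigh + aLow * bLow;
    }
    return total;
  }

  public static void main(String[] args) {
    byte[] query = {3, -7, 12, 0, 5, -128, 127, 1};
    byte[] doc = {-4, 9, 2, 11, -6, 100, -50, 8};
    int queryNorm = squaredNorm(query); // computed once, reused for every document
    System.out.println(cosine(query, doc) == cosine(query, queryNorm, doc));

    byte[] a = {1, 15, 0, 7, 9, 3};
    byte[] b = {2, 4, 15, 8, 0, 6};
    System.out.println(dotProductUnpacked(a, b) == dotProductBothPacked(packInt4(a), packInt4(b)));
  }
}
```

In this sketch both cosine variants evaluate the same final expression over the same integer sums, which is why exact float equality between them is a reasonable thing to test. The both-packed dot product reads half as many query bytes per document as a version that keeps the query unpacked.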
