benwtrent opened a new pull request, #11843: URL: https://github.com/apache/lucene/pull/11843
PR https://github.com/apache/lucene/pull/833 helpfully introduced query cancellation checks for KNN vector queries. However, checking for cancellation on every vector read hurts performance. This change proposes that we no longer check on every vector.

The performance hit was first noticed in the Elasticsearch nightly benchmarks:
- Related PRs: [Initial Addition](https://github.com/elastic/elasticsearch/pull/90612), [Fix](https://github.com/elastic/elasticsearch/pull/90804)
- Performance numbers: https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector/nightly/default/90d (notice `nightly-dense_vector-add-4g-1node-script-score-query-latency`)

The Lucene benchmarks indicate no dramatic change in VectorSearch around May, when the change was merged (I may be missing where to look).

It is common to iterate over many vectors, especially since no early exit mechanism is currently available. Other exitable iterators don't check on every value read and usually sample instead (see `ExitableTermsEnum` as prior art).
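
To illustrate the sampling idea described above, here is a minimal sketch in plain Java. It is not the code in this PR and does not use any Lucene API: the class `SampledCancellationCheck`, its `shouldExit` method, and the interval of 16 are hypothetical names and values chosen for illustration. The cancellation signal is modeled as a `BooleanSupplier`, on the assumption that polling it is the expensive part.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of Lucene): polls a cancellation signal
 * only once every `interval` calls instead of on every vector read.
 */
final class SampledCancellationCheck {
  private final BooleanSupplier cancelled; // assumed to be costly to poll
  private final int mask;                  // interval - 1; interval is a power of two
  private int calls;                       // number of reads seen so far

  SampledCancellationCheck(BooleanSupplier cancelled, int interval) {
    if (interval <= 0 || Integer.bitCount(interval) != 1) {
      throw new IllegalArgumentException("interval must be a positive power of two");
    }
    this.cancelled = cancelled;
    this.mask = interval - 1;
  }

  /** Call once per vector read; returns true if iteration should stop. */
  boolean shouldExit() {
    // Poll on calls 0, interval, 2 * interval, ...; skip the signal otherwise.
    if ((calls++ & mask) != 0) {
      return false;
    }
    return cancelled.getAsBoolean();
  }
}
```

A caller would invoke `shouldExit()` once per vector read and abort when it returns `true`. The trade-off is that a cancellation request can go unnoticed for up to `interval - 1` further reads, in exchange for polling the signal far less often.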
