gashutos opened a new issue, #13313: URL: https://github.com/apache/lucene/issues/13313
### Description ### Background Lucene sort queries are using skipping logic for faster execution and skip non-competitive documents by updating its competitive iterator whenever it updates its bottom value in priority queue. [Reference](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java#L205C1-L206C1) This works for fields indexed with BKD. In case of `after` value, `after` value is considerd as topValue we try to execute skipping logic with both `topValue` and `afterValue`. The skipping logic has a [constraint](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java#L274), and with that, the estimated number of competitive documents must reduce to `1/8` to intersect BKD skipping logic. That can be problamatic in like described below. ### Problem With #12333 change, we started using `topValue` to skip documents in case `topValue` is able to skip `7/8` number of documents. But what if we only able to skip `6/8` number of documents with `topvalue` and those all non-competitive documents `6/8` are ahead of `after` document in docsIdIterator ? This problem specifically comes in `timeseries` workload. i.e. we have time series segment where documents ids and document values are in nearly sorted order and lets assume we have `100` documents and its time field values are from 1,2,3,....100. Now if I trigger sort query `sort` on this field with `after` values as `87`, we wont able to skip first 86 documents and they will end up being in comparison here in [PagingFieldCollector](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java#L274). Assume this happening for millions of documents in case of time series workload. ### Proposed solution Lets invoke skipping logic in case `topValue` is known but `bottomvalue` is unknown ir-respective of number of docuements we are able to skip. That will be one time invocation of skipping logic in case `after` value is specific and we will skip all documents which are non-competitive w.r.t after value. And from next iteration, we will know about `bottomValue` once priority queue is full and can have constraint of 7/8 documents skipping..... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org