costin opened a new pull request, #16166:
URL: https://github.com/apache/lucene/pull/16166

   Multi-valued `SortedNumericDocValuesField.newSlowRangeQuery` can currently 
miss Lucene's bitset-based conjunction path, even when the field has a range 
skip index.  The query is exposed as a two-phase iterator, so range checks over 
MAYBE skip blocks still happen doc by doc through `advanceExact`, 
`docValueCount`, and `nextValue`.
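   
   To make the per-document cost concrete, here is a minimal, self-contained sketch of that doc-by-doc check. `MultiValuedSource`, `ArraySource`, and `PerDocRangeSketch` are hypothetical names used only for illustration; the interface merely mirrors the three calls named above and is not Lucene's actual class.

```java
import java.util.BitSet;

/** Hypothetical stand-in for the three per-document calls; not Lucene's class. */
interface MultiValuedSource {
  boolean advanceExact(int doc); // position on doc; false if it has no values
  int docValueCount();           // number of values for the current doc
  long nextValue();              // next value, in ascending order
}

/** Toy array-backed source: one sorted long[] per document. */
final class ArraySource implements MultiValuedSource {
  private final long[][] perDoc;
  private long[] current = new long[0];
  private int pos;

  ArraySource(long[][] perDoc) {
    this.perDoc = perDoc;
  }

  @Override
  public boolean advanceExact(int doc) {
    current = perDoc[doc];
    pos = 0;
    return current.length > 0;
  }

  @Override
  public int docValueCount() {
    return current.length;
  }

  @Override
  public long nextValue() {
    return current[pos++];
  }
}

final class PerDocRangeSketch {
  /** True if any value of doc lies in [lower, upper]. */
  static boolean matches(MultiValuedSource values, int doc, long lower, long upper) {
    if (!values.advanceExact(doc)) {
      return false;
    }
    for (int i = 0, count = values.docValueCount(); i < count; i++) {
      long value = values.nextValue();
      if (value >= lower) {
        // Values are sorted, so the first one >= lower decides the match.
        return value <= upper;
      }
    }
    return false;
  }

  /** Checks every doc in [0, maxDoc) one at a time. */
  static BitSet matchAll(MultiValuedSource values, int maxDoc, long lower, long upper) {
    BitSet result = new BitSet(maxDoc);
    for (int doc = 0; doc < maxDoc; doc++) {
      if (matches(values, doc, lower, upper)) {
        result.set(doc);
      }
    }
    return result;
  }
}
```

   With the toy documents `{1, 5}`, `{}`, `{10, 20}`, `{7}` and the range `[6, 10]`, `matchAll` sets bits 2 and 3. Every document pays for three interface calls, which is the overhead a bulk path tries to amortize.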
   
   This PR adds a scalar bulk range path for sorted numeric doc values. 
   
   The goal is to let skip-indexed multi-valued range queries participate in 
`DenseConjunctionBulkScorer`'s bitset flow before introducing any vectorization.
   
   I tried to mimic the existing shape of `NumericDocValues.rangeIntoBitSet`, 
`BatchDocValuesRangeIterator`, and `SkipBlockRangeIterator`: keep the public 
API small, keep the query wiring narrow, and put the storage-aware fast loop 
inside Lucene90 where flattened sorted-numeric values and addresses are 
available.
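   
   As a rough illustration of what a storage-aware scalar loop over flattened values and addresses might look like, here is a self-contained sketch. The class name, method signature, and `int[]` address layout are assumptions made for this example; they are not the signatures added by this PR nor Lucene90's actual on-disk encoding.

```java
import java.util.BitSet;

final class BulkRangeSketch {
  /**
   * Hypothetical bulk range check. values holds every doc's sorted values back to back;
   * addresses[doc] and addresses[doc + 1] delimit the slice belonging to doc (an empty
   * slice means the doc has no values). For each doc in [fromDoc, toDoc) with a value in
   * [lower, upper], bit (doc - offset) is set in out.
   */
  static void rangeIntoBitSet(
      long[] values,
      int[] addresses,
      int fromDoc,
      int toDoc,
      long lower,
      long upper,
      BitSet out,
      int offset) {
    for (int doc = fromDoc; doc < toDoc; doc++) {
      int start = addresses[doc];
      int end = addresses[doc + 1];
      for (int i = start; i < end; i++) {
        long value = values[i];
        if (value >= lower) {
          // Slice is sorted: the first value >= lower decides the match.
          if (value <= upper) {
            out.set(doc - offset);
          }
          break;
        }
      }
    }
  }
}
```

   For example, with `values = {1, 5, 10, 20, 7}` and `addresses = {0, 2, 2, 4, 5}` (four docs, the second one empty), the range `[6, 10]` over docs `[0, 4)` with `offset = 0` sets bits 2 and 3. The loop touches plain arrays only, with no per-document iterator calls, which is the shape that lends itself to later vectorization.
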
   This PR intentionally focuses on the scalar version so we can review and 
agree on the API and iterator behavior first. 
   Once this shape looks right and the direction is confirmed, I plan to follow 
up with a vectorized implementation in a separate PR.
   
   Feedback is welcome!
   
   ## Target queries
   This should mainly help slow doc-values range filters over truly multi-valued 
sorted numeric fields when they are combined with other filters and the field 
has a range skip index. 
   The biggest wins are expected when conjunction scoring can use bitset 
windows and the field is dense enough for bulk range evaluation to amortize 
iterator overhead.
   
   ## Benchmarks
   `SortedNumericDocValuesRangeQueryBenchmark`
   
   Platform: JDK 25, 1M docs, `valuesPerDoc=4`, fixed cardinality, dense docs.
   
   | pattern | selectivity | baseline (ops/s) | candidate (ops/s) | speedup |
   |---|---:|---:|---:|---:|
   | clustered | 0.01 | 4,847 ± 63 | 7,065 ± 56 | 1.46x |
   | clustered | 0.1 | 4,770 ± 73 | 7,347 ± 64 | 1.54x |
   | clustered | 0.5 | 6,254 ± 77 | 18,946 ± 314 | 3.03x |
   | random | 0.01 | 34.5 ± 0.4 | 51.9 ± 0.7 | 1.51x |
   | random | 0.1 | 32.2 ± 0.1 | 45.8 ± 0.1 | 1.42x |
   | random | 0.5 | 36.3 ± 0.5 | 53.9 ± 0.6 | 1.48x |
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

