sgup432 commented on code in PR #16141:
URL: https://github.com/apache/lucene/pull/16141#discussion_r3314753945


##########
lucene/core/src/java/org/apache/lucene/search/Weight.java:
##########
@@ -300,6 +310,27 @@ private static void scoreIterator(
       }
     }
 
+    private void scoreIteratorIntoBitSet(
+        LeafCollector collector, Bits acceptDocs, DocIdSetIterator iterator, int max)
+        throws IOException {
+      for (int doc = iterator.docID(); doc < max; ) {
+        int windowBase = doc;
+        int windowMax = MathUtil.unsignedMin(max, windowBase + WINDOW_SIZE);
+
+        assert windowMatches.scanIsEmpty();
+        iterator.intoBitSet(windowMax, windowMatches, windowBase);
+
+        if (acceptDocs != null) {

Review Comment:
   Should we short-circuit here if windowMatches is empty after intoBitSet? We 
could skip both applyMask and collector.collect in that case.
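
A minimal, self-contained sketch of the short-circuit idea, not the PR's code. It uses `java.util.BitSet` instead of Lucene's bit sets, and the class name, the `collect` helper, the fixed-stride window layout, and the window size of 64 are all assumptions made for illustration; the PR may advance `windowBase` differently.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.IntPredicate;

public class WindowedCollectSketch {
  // Hypothetical window size; the real WINDOW_SIZE in the PR may differ.
  static final int WINDOW_SIZE = 64;

  /**
   * Collects matching docs window by window and returns how many windows
   * were skipped because they were empty right after being filled.
   * sortedDocs stands in for the iterator; acceptDocs may be null.
   */
  static int collect(int[] sortedDocs, IntPredicate acceptDocs, int max, List<Integer> out) {
    int skipped = 0;
    int next = 0; // position in sortedDocs, plays the role of the iterator
    BitSet window = new BitSet(WINDOW_SIZE);
    for (int windowBase = 0; windowBase < max; windowBase += WINDOW_SIZE) {
      int windowMax = Math.min(max, windowBase + WINDOW_SIZE);

      // Stand-in for iterator.intoBitSet(windowMax, windowMatches, windowBase).
      while (next < sortedDocs.length && sortedDocs[next] < windowMax) {
        window.set(sortedDocs[next] - windowBase);
        next++;
      }

      // Short-circuit: nothing matched, so skip masking and collection.
      if (window.isEmpty()) {
        skipped++;
        continue;
      }

      // Stand-in for applying the acceptDocs mask.
      if (acceptDocs != null) {
        for (int b = window.nextSetBit(0); b >= 0; b = window.nextSetBit(b + 1)) {
          if (!acceptDocs.test(windowBase + b)) {
            window.clear(b);
          }
        }
      }

      // Stand-in for collector.collect(...) on the remaining bits.
      for (int b = window.nextSetBit(0); b >= 0; b = window.nextSetBit(b + 1)) {
        out.add(windowBase + b);
      }
      window.clear();
    }
    return skipped;
  }
}
```

Whether the check pays off depends on how often a window is actually empty at that point, which this sketch does not measure.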



##########
lucene/core/src/java/org/apache/lucene/search/Weight.java:
##########
@@ -277,7 +282,12 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max)
       // iterator implementations.
       if (twoPhase == null && competitiveIterator == null) {
         // Optimize simple iterators with collectors that can't skip
-        scoreIterator(collector, acceptDocs, iterator, max);
+        if (scorer instanceof ConstantScoreScorer constantScoreScorer
+            && constantScoreScorer.canBulkCollectDocIdStream()) {
+          scoreIteratorIntoBitSet(collector, acceptDocs, iterator, max);

Review Comment:
   I wonder whether we should do this for the sparse use case, as the batching 
overhead might exceed per-doc collection when windows are mostly empty.
   We could add a density check like:
   ```
   if (iterator.cost() >= range / 32) {
     // dense enough: call scoreIteratorIntoBitSet
   } else {
     // otherwise fall back to per-doc collection
   }
   ```
   Something similar happens in DenseConjunctionBulkScorer.
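
A small standalone sketch of the density check suggested above. It assumes `range` means `max - min`, which the comment does not spell out; the class and method names are hypothetical, and the divisor 32 is taken directly from the snippet. A real check would probably need to clamp `max` first, since `max` can be `DocIdSetIterator.NO_MORE_DOCS`.

```java
public class DensityCheckSketch {
  // Divisor from the review comment: require roughly one match per 32 docs.
  static final int DENSITY_DIVISOR = 32;

  /**
   * Returns true when the estimated number of matches (cost) is dense enough,
   * relative to the doc ID range being scored, to justify the bitset path.
   * Assumes range = max - min; an empty range always takes the per-doc path.
   */
  static boolean useBitSetPath(long cost, int min, int max) {
    long range = (long) max - min; // widen to long to avoid int overflow
    return range > 0 && cost >= range / DENSITY_DIVISOR;
  }
}
```

For example, with `min = 0` and `max = 32000`, a cost of 1000 would select the bitset path and a cost of 10 would fall back to per-doc collection.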



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

