iprithv commented on code in PR #16146:
URL: https://github.com/apache/lucene/pull/16146#discussion_r3320544167


##########
lucene/core/src/java/org/apache/lucene/search/FilterLeafCollector.java:
##########
@@ -42,6 +46,16 @@ public void collect(int doc) throws IOException {
     in.collect(doc);
   }
 
+  @Override

Review Comment:
   `FilterLeafCollector` is meant to let subclasses override `collect(int)` and 
handle each doc. but with these new methods (`collect(DocIdStream)` and 
`collectRange`), it just forwards everything directly to `in`. that means the 
subclass’s `collect(int)` never gets called. so any existing subclass that only 
overrides `collect(int)` will silently break when bulk collection is used.
   
   I see javadoc saying subclasses should override the new methods too, but 
that’s not really safe. old code won’t do that, and things will just start 
giving wrong results without any error.
   
   a safer default would be to break the batch calls into per-doc calls, like:
   
   ```java id="safex1"
   @Override
   public void collect(DocIdStream stream) throws IOException {
       stream.forEach(this::collect);
   }
   
   @Override
   public void collectRange(int min, int max) throws IOException {
       for (int doc = min; doc < max; doc++) {
           collect(doc);
       }
   }
   ```
   
   also noticed the subclasses updated already call `this::collect`. but 
`FilterLeafCollector` itself doesn’t, so anything not updated will behave 
incorrectly.
   
   I think this could silently break existing code.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to