jmazanec15 commented on code in PR #1068:
URL: https://github.com/apache/lucene/pull/1068#discussion_r966298807
##########
lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java:
##########
@@ -281,6 +281,12 @@ private int doNext(int doc) throws IOException {
     advanceLead:
     for (; ; doc = lead.nextDoc()) {
       if (doc >= minLength) {
+        if (doc != NO_MORE_DOCS) {
+          lead.advance(NO_MORE_DOCS);
+        }
+        for (BitSetIterator iterator : bitSetIterators) {
+          iterator.setDocId(NO_MORE_DOCS);
+        }
Review Comment:
> The if statement makes sense to me, but I'm curious how you managed to hit
> this case. This suggests that we create BitSets whose size is not maxDoc, do
> you know where this happens?
I think I might be misunderstanding the question. Each BitSetIterator can be
backed by a bit set of a different length, potentially as an optimization (I think
[minLength](https://github.com/apache/lucene/blob/branch_9_4/lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java#L256)
suggests this is expected). If a BitSetIterator's highest match is doc 10 and
there are 1M docs in the index, I think there is no reason to store 1M bits:
the BitSetIterator can just exhaust after doc 10.
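To make the scenario concrete, here is a minimal sketch of a conjunction over bit sets of different logical lengths. It is not the Lucene code: the class `ShortBitSetConjunction`, the method `firstCommonDoc`, and the `lengths` array are hypothetical names, and it uses `java.util.BitSet` with a plain linear scan, assuming `target >= 0`.

```java
import java.util.BitSet;

/** Hypothetical illustration only; not part of Lucene. */
final class ShortBitSetConjunction {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Returns the first doc >= target that is set in every bit set, or
   * NO_MORE_DOCS. lengths[i] is the logical number of bits in sets[i];
   * the shortest one bounds the scan, since no doc at or beyond it can
   * match all sets. Assumes target >= 0.
   */
  static int firstCommonDoc(BitSet[] sets, int[] lengths, int target) {
    int minLength = Integer.MAX_VALUE;
    for (int length : lengths) {
      minLength = Math.min(minLength, length);
    }
    for (int doc = target; doc < minLength; doc++) {
      boolean inAll = true;
      for (BitSet set : sets) {
        if (!set.get(doc)) {
          inAll = false;
          break;
        }
      }
      if (inAll) {
        return doc;
      }
    }
    // The scan reached minLength without a match, so the conjunction is exhausted.
    return NO_MORE_DOCS;
  }
}
```

With one set of logical length 11 and another of length 1M, the scan stops at doc 11 even though the longer set may have later bits set.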
> The for loop should be unnecessary, there is no guarantee that all sub
> iterators advance to NO_MORE_DOCS. If this causes problems, then it means we
> have another bug somewhere else?
Agreed, this is probably unnecessary. I added it to ensure that [this
statement](https://github.com/apache/lucene/blob/branch_9_4/lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java#L31-L32)
holds: "Requires that all of its sub-iterators must be on the same document
all the time."
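As a rough sketch of what the added loop is meant to guarantee, the snippet below models sub-iterators as simple cursors and checks the "same document" condition before and after exhausting them. `SameDocInvariantSketch`, `Cursor`, `exhaustAll`, and `allOnSameDoc` are hypothetical names for illustration, not Lucene APIs.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class SameDocInvariantSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Stand-in for a sub-iterator; it only tracks its current doc. */
  static final class Cursor {
    private int doc = -1; // -1 means "not positioned yet"

    int docID() {
      return doc;
    }

    void setDocId(int doc) {
      this.doc = doc;
    }
  }

  /** Moves the lead and every other cursor to NO_MORE_DOCS. */
  static void exhaustAll(Cursor lead, Cursor[] others) {
    lead.setDocId(NO_MORE_DOCS);
    for (Cursor other : others) {
      other.setDocId(NO_MORE_DOCS);
    }
  }

  /** Returns true if every cursor is positioned on the lead's doc. */
  static boolean allOnSameDoc(Cursor lead, Cursor[] others) {
    for (Cursor other : others) {
      if (other.docID() != lead.docID()) {
        return false;
      }
    }
    return true;
  }
}
```

If the lead is exhausted but the other cursors are left where they were, `allOnSameDoc` returns false; after `exhaustAll` it returns true.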
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]