[ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204564#comment-17204564
 ] 

Adrien Grand commented on LUCENE-9541:
--------------------------------------

In my opinion this is a bug in how ConjunctionDISI was used, not a bug in 
BitSetConjunctionDISI.

Conjunction iterators maintain the invariant that between two calls to nextDoc 
or advance, all sub iterators are on the same doc ID. If we advance one of the 
subs without making ConjunctionDISI aware of it, then we break this invariant. 
We found this with BitSetConjunctionDISI but this would also be a problem with 
ConjunctionDISI.

To take your example of a and b both matching [0,1]: if you create a 
ConjunctionDISI over both iterators and a is picked as the lead, then advance b 
to 1 and finally call nextDoc() on the ConjunctionDISI, it will first call 
nextDoc() on a, which returns 0, and then advance(0) on b. That call is illegal 
because b is already on doc 1, and advance must not be called with a target 
that is less than or equal to the current doc ID.
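
To make the sequence above concrete, here is a minimal sketch in plain Java. 
It does not use Lucene: ArrayIterator and TwoWayConjunction are hypothetical 
stand-ins for DocIdSetIterator and ConjunctionDISI, and the exception thrown 
from advance is my own addition to make the contract violation visible (a real 
iterator is not required to detect it).

```java
import java.util.ArrayList;
import java.util.List;

class ConjunctionInvariantSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a doc ID iterator over a sorted array of doc IDs. */
    static final class ArrayIterator {
        private final int[] docs;
        private int idx = -1; // -1 means "not positioned yet"

        ArrayIterator(int... docs) { this.docs = docs; }

        int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        int nextDoc() { idx++; return docID(); }

        int advance(int target) {
            // Assumption of this sketch: fail loudly when the target is not
            // strictly greater than the current doc ID.
            if (target <= docID()) {
                throw new IllegalArgumentException(
                    "advance(" + target + ") called while on doc " + docID());
            }
            do { idx++; } while (idx < docs.length && docs[idx] < target);
            return docID();
        }
    }

    /** Simplified two-iterator conjunction; relies on both subs being on the same doc between calls. */
    static final class TwoWayConjunction {
        private final ArrayIterator lead, other;

        TwoWayConjunction(ArrayIterator lead, ArrayIterator other) {
            this.lead = lead;
            this.other = other;
        }

        int nextDoc() {
            int doc = lead.nextDoc();
            while (doc != NO_MORE_DOCS) {
                int otherDoc = other.advance(doc);   // legal only if other is behind doc
                if (otherDoc == doc) return doc;     // both subs agree: a match
                doc = lead.advance(otherDoc);        // catch the lead up
                if (doc == otherDoc) return doc;
            }
            return NO_MORE_DOCS;
        }
    }

    /** Only the conjunction moves the sub iterators, so the invariant holds. */
    static List<Integer> correctUse() {
        TwoWayConjunction disi =
            new TwoWayConjunction(new ArrayIterator(0, 1), new ArrayIterator(0, 1));
        List<Integer> hits = new ArrayList<>();
        for (int d = disi.nextDoc(); d != NO_MORE_DOCS; d = disi.nextDoc()) hits.add(d);
        return hits;
    }

    /** b is advanced behind the conjunction's back, which breaks the invariant. */
    static String misuse() {
        ArrayIterator a = new ArrayIterator(0, 1);
        ArrayIterator b = new ArrayIterator(0, 1);
        TwoWayConjunction disi = new TwoWayConjunction(a, b);
        b.advance(1); // the conjunction is not aware of this move
        try {
            disi.nextDoc(); // a.nextDoc() returns 0, then b.advance(0) is attempted
            return "no error";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(correctUse());
        System.out.println(misuse());
    }
}
```

In this model the same failure occurs regardless of whether the second 
iterator is backed by a bit set, which is why I see it as a usage problem 
rather than something specific to BitSetConjunctionDISI.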


> BitSetConjunctionDISI doesn't advance based on its components
> -------------------------------------------------------------
>
>                 Key: LUCENE-9541
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9541
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Mayya Sharipova
>            Priority: Minor
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead (a DocIdSetIterator) and 
> doesn't consider that its other component (a BitSetIterator) may have 
> already advanced past a certain doc. This may result in duplicate documents.
> For example, if BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator 
> _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1], then doing 
> `b.nextDoc()` collects doc0, and doing `disi.nextDoc()` collects the same 
> doc0 again.
> It seems that other conjunction iterators don't have this behaviour: if we 
> advance any of their components past a certain document, the whole 
> conjunction iterator will also be advanced past this document. 
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
