I agree that it would save work in that case, but this query should be very
fast anyway.

On the other hand, if term1, term2 and term3 have 10M matches each, the
conjunction will need to check if the current candidate match is
NO_MORE_DOCS millions of times even though this would only happen once.

In general it's better to have less overhead for expensive queries and more
overhead for cheap queries than the other way around.

Le dim. 1 oct. 2023, 17:35, YouPeng Yang <yypvsxf19870...@gmail.com> a
écrit :

> Hi Adrien
>     suppose that conjunction query like  (term1 AND term2 AND term3) ,if
> the term1 does not exist ,and then the loop execution may cause
> unnecessary  overhead.(sorry I have not yet find out whether there is any
> filter work before the doNext()..
>
> Best Regard
>
> Adrien Grand <jpou...@gmail.com> 于2023年10月1日周日 22:30写道:
>
>> Hello,
>>
>> This change would be correct, but it would only save work when the
>> conjunction is exhausted, and add overhead otherwise?
>>
>> Le sam. 30 sept. 2023, 16:20, YouPeng Yang <yypvsxf19870...@gmail.com> a
>> écrit :
>>
>>> Hi
>>>   I am reading the code of class ConjunctionDISI .and about the method
>>> nextDoc ,  Suppose that the sub-disi is emtpy in the lead1.lead2,should
>>> there be that it can return immediately when the input doc==NO_MORE_DOCS?
>>>
>>>
>>> private int doNext(int doc) throws IOException {
>>> advanceHead:
>>> for (; ; ) {
>>> assert doc == lead1.docID();
>>> //assumpt doc==NO_MORE_DOCS ,it return
>>> if(doc==NO_MORE_DOCS){
>>> return NO_MORE_DOCS;
>>> }
>>> // find agreement between the two iterators with the lower costs
>>> // we special case them because they do not need the
>>> // 'other.docID() < doc' check that the 'others' iterators need
>>> final int next2 = lead2.advance(doc);
>>> if (next2 != doc) {
>>> doc = lead1.advance(next2);
>>> if(doc==NO_MORE_DOCS){
>>> return NO_MORE_DOCS;
>>> }
>>> if (next2 != doc) {
>>> continue;
>>> }
>>> }
>>> ..left omited...
>>> }
>>>
>>

Reply via email to