This is indeed a good approach; Lucene does this too.
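
For illustration, here is a minimal sketch of the rarest-term-first idea quoted below. It works on plain sorted int arrays instead of Lucene's DocIdSetIterator, and the class and method names are hypothetical; it is not the ConjunctionDISI code, only the general technique under those assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of Lucene.
final class RarestFirstConjunction {

    /**
     * Intersects sorted, duplicate-free doc-ID lists.
     * The shortest (rarest) list drives the iteration, and the method
     * returns as soon as any list is known to have no further matches.
     */
    static List<Integer> intersect(int[][] postings) {
        List<Integer> result = new ArrayList<>();
        if (postings.length == 0) {
            return result;
        }
        // Order the lists by length so the rarest term is checked first.
        int[][] sorted = postings.clone();
        Arrays.sort(sorted, Comparator.comparingInt(p -> p.length));
        if (sorted[0].length == 0) {
            return result; // rarest term has no matches: nothing to do
        }
        int[] pos = new int[sorted.length]; // cursor into each list
        candidates:
        for (int doc : sorted[0]) {
            for (int i = 1; i < sorted.length; i++) {
                int[] list = sorted[i];
                // Advance this list to the first entry >= doc.
                while (pos[i] < list.length && list[pos[i]] < doc) {
                    pos[i]++;
                }
                if (pos[i] == list.length) {
                    return result; // one list is exhausted: stop checking
                }
                if (list[pos[i]] != doc) {
                    continue candidates; // doc is missing from this list
                }
            }
            result.add(doc);
        }
        return result;
    }
}
```

With the lists {1, 3, 5, 7, 9}, {3, 9} and {2, 3, 4, 9, 10}, the two-element list leads and intersect returns [3, 9]. If any list is empty, the method returns before scanning the others.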

On Sun, Oct 1, 2023 at 19:33, Walter Underwood <wun...@wunderwood.org> wrote:

> At Infoseek, the engine checked the terms in frequency order, with the
> rarest term first. If the conjunction reached zero matches at any point,
> it stopped checking.
>
> This might be a related but more general approach.
>
> That was almost 30 years ago, so any patents are long-expired.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> On Oct 1, 2023, at 10:12 AM, Adrien Grand <jpou...@gmail.com> wrote:
>
> I agree that it would save work in that case, but this query should be
> very fast anyway.
>
> On the other hand, if term1, term2 and term3 have 10M matches each, the
> conjunction will need to check if the current candidate match is
> NO_MORE_DOCS millions of times even though this would only happen once.
>
> In general it's better to have less overhead for expensive queries and
> more overhead for cheap queries than the other way around.
>
> On Sun, Oct 1, 2023 at 17:35, YouPeng Yang <yypvsxf19870...@gmail.com> wrote:
>
>> Hi Adrien,
>>     Suppose a conjunction query like (term1 AND term2 AND term3). If
>> term1 does not exist, the loop may incur unnecessary overhead.
>> (Sorry, I have not yet found out whether there is any filtering
>> before doNext().)
>>
>> Best regards
>>
>> On Sun, Oct 1, 2023 at 22:30, Adrien Grand <jpou...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> This change would be correct, but it would only save work when the
>>> conjunction is exhausted, and add overhead otherwise?
>>>
>>> On Sat, Sep 30, 2023 at 16:20, YouPeng Yang <yypvsxf19870...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>   I am reading the code of the class ConjunctionDISI, in particular the
>>>> method nextDoc. Suppose that a sub-DISI (lead1 or lead2) is empty: should
>>>> doNext not return immediately when the input doc == NO_MORE_DOCS?
>>>>
>>>>
>>>> private int doNext(int doc) throws IOException {
>>>>   advanceHead:
>>>>   for (; ; ) {
>>>>     assert doc == lead1.docID();
>>>>     // proposed: if doc == NO_MORE_DOCS, return right away
>>>>     if (doc == NO_MORE_DOCS) {
>>>>       return NO_MORE_DOCS;
>>>>     }
>>>>     // find agreement between the two iterators with the lower costs
>>>>     // we special case them because they do not need the
>>>>     // 'other.docID() < doc' check that the 'others' iterators need
>>>>     final int next2 = lead2.advance(doc);
>>>>     if (next2 != doc) {
>>>>       doc = lead1.advance(next2);
>>>>       // proposed: same early return after advancing lead1
>>>>       if (doc == NO_MORE_DOCS) {
>>>>         return NO_MORE_DOCS;
>>>>       }
>>>>       if (next2 != doc) {
>>>>         continue;
>>>>       }
>>>>     }
>>>>     // ... rest omitted ...
>>>>   }
>>>>
>>>
>
