At Infoseek, the engine checked the terms in frequency order, with the most rare term first. If the conjunction reached zero matches at any point, it stopped checking.
This might be a related but more general approach. That was almost 30 years ago, so any patents are long-expired. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 1, 2023, at 10:12 AM, Adrien Grand <jpou...@gmail.com> wrote: > > I agree that it would save work in that case, but this query should be very > fast anyway. > > On the other hand, if term1, term2 and term3 have 10M matches each, the > conjunction will need to check if the current candidate match is NO_MORE_DOCS > millions of times even though this would only happen once. > > In general it's better to have less overhead for expensive queries and more > overhead for cheap queries than the other way around. > > Le dim. 1 oct. 2023, 17:35, YouPeng Yang <yypvsxf19870...@gmail.com > <mailto:yypvsxf19870...@gmail.com>> a écrit : >> Hi Adrien >> suppose that conjunction query like (term1 AND term2 AND term3) ,if the >> term1 does not exist ,and then the loop execution may cause unnecessary >> overhead.(sorry I have not yet find out whether there is any filter work >> before the doNext().. >> >> Best Regard >> >> Adrien Grand <jpou...@gmail.com <mailto:jpou...@gmail.com>> 于2023年10月1日周日 >> 22:30写道: >>> Hello, >>> >>> This change would be correct, but it would only save work when the >>> conjunction is exhausted, and add overhead otherwise? >>> >>> Le sam. 30 sept. 2023, 16:20, YouPeng Yang <yypvsxf19870...@gmail.com >>> <mailto:yypvsxf19870...@gmail.com>> a écrit : >>>> Hi >>>> I am reading the code of class ConjunctionDISI .and about the method >>>> nextDoc , Suppose that the sub-disi is emtpy in the lead1.lead2,should >>>> there be that it can return immediately when the input doc==NO_MORE_DOCS? >>>> >>>> >>>> private int doNext(int doc) throws IOException { >>>> advanceHead: >>>> for (; ; ) { >>>> assert doc == lead1.docID(); >>>> //assumpt doc==NO_MORE_DOCS ,it return >>>> if(doc==NO_MORE_DOCS){ >>>> return NO_MORE_DOCS; >>>> } >>>> // find agreement between the two iterators with the lower costs >>>> // we special case them because they do not need the >>>> // 'other.docID() < doc' check that the 'others' iterators need >>>> final int next2 = lead2.advance(doc); >>>> if (next2 != doc) { >>>> doc = lead1.advance(next2); >>>> if(doc==NO_MORE_DOCS){ >>>> return NO_MORE_DOCS; >>>> } >>>> if (next2 != doc) { >>>> continue; >>>> } >>>> } >>>> ..left omited... >>>> }