This algo is strictly tied to sort-by-score, if I understand it correctly. Lucene has queries and sorting decoupled (except for allowOutOfOrder mess), so implementing it would require some really fat hacks.
On Mon, Nov 16, 2009 at 20:26, J. Delgado <joaquin.delg...@gmail.com> wrote: > As I understood it setMinimumNumberShouldMatch(int min) Is used to > specify a minimum number of the optional BooleanClauses which must be > satisfied. > > I haven't seen the implementation of setMinimumNumberShouldMatch but > it seems a bit different than what is intended with the WAND operator, > which can take any real number as threshold θ > > As stated in the paper: > > WAND(X1,w1, . . . Xk,wk, θ) is true iff X 1≤i≤k and SUM(xiwi)≥ θ > > where xi is the indicator variable for Xi, that is xi = 1, if Xi is > true 0, otherwise. > > Observe that WAND can be used to implement AND > and OR via > AND(X1,X2, . . .Xk) ≡ WAND(X1, 1,X2, 1, . . . Xk, 1, k), > and > OR(X1,X2, . ..Xk) ≡ WAND(X1, 1,X2, 1, . ..Xk, 1, 1). > > What I find interesting is the idea of using a first pass using the > upper bound (maximal) contribution of a term on any document score and > the dynamic setting of the threshold θ to skip or to fully evaluate a > document.. > > As stated in the paper: > > "Given this setup our preliminary scoring consists of evaluating > for each document d > WAND(X1,UB1,X2,UB2, . . .,Xk,UBk, θ), > where Xi is an indicator variable for the presence of query term i in > document d and the threshold θ is varied during > the algorithm as explained below. If WAND evaluates to true, then the > document d undergoes a full evaluation. > The threshold θ is set dynamically by the algorithm based on the > minimum score m among the top n results found so > far, where n is the number of requested documents. The larger the > threshold, the more documents will be skipped > and thus we will need to compute full scores for fewer documents." > > I think its worth a try... > > -- Joaquin > > On Mon, Nov 16, 2009 at 2:54 AM, Andrzej Bialecki <a...@getopt.org> wrote: >> >> J. Delgado wrote: >>> >>> Here is the link to the paper. >>> http://cis.poly.edu/westlab/papers/cntdstrb/p426-broder.pdf >>> >>> A more recent application of the use and extension of the WAND operator for >>> indexing of Boolean expressions: >>> http://ilpubs.stanford.edu:8090/927/2/wand_vldb.pdf >>> >>> -- Joaquin >>> >>> >>> On Sun, Nov 15, 2009 at 11:12 PM, Shalin Shekhar Mangar < >>> shalinman...@gmail.com> wrote: >>> >>>> Hey Joaquin, >>>> >>>> The mailing list strips off attachments. Can you please upload it somewhere >>>> and give us the link? >>>> >>>> On Mon, Nov 16, 2009 at 12:35 PM, J. Delgado <joaquin.delg...@gmail.com >>>>> >>>>> wrote: >>>>> Please find attached the paper on "Efficient Query Evaluation using a >>>>> Two-Level Retrieval Process". I believe that such approach may improve >>>> >>>> the >>>>> >>>>> way Lucene/Solr evaluates queries today. >> >> The functionality of WAND (weak AND) is already implemented in Lucene, if I >> understand it correctly - this is the BooleanQuery.setMinShouldMatch(int). >> Lucene implements this probably differently from the algorithm described in >> the paper, so there may be still some benefits from comparing the algorithms >> in Lucene's BooleanScorer[2] with this one ... >> >> >> -- >> Best regards, >> Andrzej Bialecki <>< >> ___. ___ ___ ___ _ _ __________________________________ >> [__ || __|__/|__||\/| Information Retrieval, Semantic Web >> ___|||__|| \| || | Embedded Unix, System Integration >> http://www.sigram.com Contact: info at sigram dot com >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > > -- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785 --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org