Hi Tomislav,

> if I understand correctly, you are suggesting query execution in two
> phases: first execute query on whole article index core (where whole
> articles are indexed, but not stored) to get article IDs (for articles
> which match original query). Then for each match in article core:
> change the AND operators from the original query to OR and add
> articleID condition/filter and execute such query on sentence based
> index (with assumption each sentence based doc has articleID set).

Yes.

> Is this correct and is this what "you'll want to run the second
> stage once for each hit from the first stage, though" is referring to?
>
> Example for this scenario would be for original query "q=apples and
> oranges", execute "q=apples and orange" with fl=articleId on article
> core and for each articleIdX result execute "q=(apples OR orange) AND
> articleId:articleIdX" on sentence based core.
>
> Same thing (with the same results) should be doable with only a single
> query in second phase, for previous example that single query for
> second phase would be for all articleId1,...,articleIdN something
> like:
>
> q=((apples OR orange) AND articleId:articleId1) OR ((apples OR orange)
> AND articleId:articleId2) OR ... OR ((apples OR orange) AND
> articleId:articleIdN)
>
> But, here in second case results are ordered by sentence scoring
> instead of article and results should be re-ordered. Is this what
> "unless you can afford to collect *all* hits and pull out each first
> stage's hit from the intermixed second stage results" is referring to?

Yes.

> My actual question after this really long intro is: couldn't this be
> done with single second level query approach, but on each topN
> start/row chunk as user iterates through first level results?
>
> For example, user executes query "q=apples and oranges" and this
> results in 1000 results, but first page display only for example 20
> results which means proposed solution would:
>
> 1. phase: execute "q=apples and orange" with fl=articleId on
> article core, but with start=0&rows=20
> 2. phase: q=((apples OR orange) AND articleId:articleId1) OR ((apples
> OR orange) AND articleId:articleId2) OR ... OR ((apples OR orange) AND
> articleId:articleId20)
> 3. Reorder sentence results to match order defined by article matching
> scores and return to user
>
> Only, the results here would need to be collapsed on unique articleID,
> so only 20 results are provided in result set (because multiple
> "sentence based doc" can be returned for a single unique articleID)
>
> Would this work?

I think so, but I don't have any experience using collapsing, so I
can't say for sure.

BTW, Otis' rearrangement of your phase #2 would also work, and would be
theoretically faster to evaluate:

    q=+(apples orange) +articleId:(articleId1 ... articleId20)

Steve
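
P.S. Phases 2 and 3 for one page of results could look roughly like the
Python sketch below. It is only an illustration under some assumptions:
the function names are hypothetical, the terms and IDs need no query
escaping, the default operator on the sentence core is OR, and the
sentence docs come back as dicts sorted by descending score. The collapse
is done on the client side here, not with Solr's own collapsing support.

```python
def build_sentence_query(terms, article_ids, id_field="articleId"):
    """Build the phase-2 query: +(t1 t2 ...) +articleId:(id1 ... idN).

    Assumes terms and IDs contain no characters that need escaping and
    that the default query operator is OR, so the bare terms inside the
    parentheses are optional alternatives.
    """
    if not terms or not article_ids:
        raise ValueError("need at least one term and one article ID")
    return "+({}) +{}:({})".format(
        " ".join(terms), id_field, " ".join(article_ids)
    )


def collapse_and_reorder(article_ids, sentence_docs, id_field="articleId"):
    """Phase 3: keep one sentence doc per article, in phase-1 order.

    sentence_docs is assumed to be sorted by descending sentence score,
    so the first doc seen for an article is its best-scoring sentence.
    """
    best = {}
    for doc in sentence_docs:
        best.setdefault(doc[id_field], doc)
    # Follow the article ranking from phase 1; skip articles with no hit.
    return [best[a] for a in article_ids if a in best]


if __name__ == "__main__":
    page_ids = ["articleId1", "articleId2"]  # phase-1 hits, start=0&rows=2
    print(build_sentence_query(["apples", "orange"], page_ids))
    docs = [
        {"articleId": "articleId2", "text": "orange juice"},
        {"articleId": "articleId1", "text": "apples fall"},
        {"articleId": "articleId2", "text": "more apples"},
    ]
    for doc in collapse_and_reorder(page_ids, docs):
        print(doc["articleId"], doc["text"])
```

Since each page only carries as many IDs as rows requested, the phase-2
query stays small, and the reordering is a single pass over the sentence
hits.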