Hi Tomislav,

> if I understand correctly, you are suggesting query execution in two
> phases: first execute query on whole article index core (where whole
> articles are indexed, but not stored) to get article IDs (for articles
> which match original query).  Then for each match in article core:
> change the AND operators from the original query to OR and add
> articleID condition/filter and execute such query on sentence based
> index (with assumption each sentence based doc has articleID set).

Yes.

> Is this correct, and is this what "you'll want to run the second
> stage once for each hit from the first stage, though" is referring to?
> 
> Example for this scenario would be for original query "q=apples and
> oranges", execute "q=apples and orange" with fl=articleId on article
> core and for each articleIdX result execute "q=(apples OR orange) AND
> articleId:articleIdX" on sentence based core.
> 
> Same thing (with the same results) should be doable with only a single
> query in second phase, for previous example that single query for
> second phase would be for all articleId1,...,articleIdN something
> like:
> 
> q=((apples OR orange) AND articleId:articleId1) OR ((apples OR orange)
> AND articleId:articleId2) OR ... OR ((apples OR orange) AND
> articleId:articleIdN)
> 
> But here, in the second case, results are ordered by sentence score
> instead of by article, so the results would need to be re-ordered. Is
> this what "unless you can afford to collect *all* hits and pull out
> each first stage's hit from the intermixed second stage results" is
> referring to?

Yes.

> My actual question after this really long intro is: couldn't this be
> done with single second level query approach, but on each topN
> start/row chunk as user iterates through first level results?
> 
> For example, user executes query "q=apples and oranges" and this
> results in 1000 results, but first page display only for example 20
> results which means proposed solution would:
> 
> 1. phase: execute "q=apples and orange" with fl=articleId on
> article core, but with start=0&rows=20
> 2. phase: q=((apples OR orange) AND articleId:articleId1) OR ((apples
> OR orange) AND articleId:articleId2) OR ... OR ((apples OR orange) AND
> articleId:articleId20)
> 3. Reorder sentence results to match order defined by article matching
> scores and return to user
> 
> Only, the results here would need to be collapsed on unique articleID,
> so only 20 results are provided in result set (because multiple
> "sentence based doc" can be returned for a single unique articleID)
> 
> Would this work?

I think so, but I don't have any experience using collapsing, so I can't say 
for sure.
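
If index-side collapsing turns out to be awkward, steps 2 and 3 could also be
finished on the client. Here is a minimal Python sketch of that idea, not a
tested implementation. It assumes each sentence hit arrives as a dict carrying
"articleId" (from your description) plus "score" and "text", which are
hypothetical field names I made up for illustration.

```python
def collapse_and_reorder(article_ids, sentence_hits):
    """Keep the best-scoring sentence per article, in first-stage order.

    article_ids:   IDs from the article core, already in article-score order.
    sentence_hits: dicts with "articleId", "score" and "text" keys
                   ("score" and "text" are hypothetical names).
    """
    best = {}
    for hit in sentence_hits:
        aid = hit["articleId"]
        # Collapse: remember only the highest-scoring sentence per article.
        if aid not in best or hit["score"] > best[aid]["score"]:
            best[aid] = hit
    # Reorder: follow the article core's ranking, skipping any article
    # for which the sentence core returned nothing.
    return [best[aid] for aid in article_ids if aid in best]
```

With 20 article IDs per page this touches at most the sentence hits for those
20 articles, so the cost stays bounded by the page size rather than by the
full result count.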

BTW, Otis' rearrangement of your phase #2 would also work, and would be
theoretically faster to evaluate:

q=+(apples orange) +articleId:(articleId1 ... articleId20)
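
Building that query string for each page of first-stage hits could look
roughly like the Python sketch below. The function name is hypothetical, and
it assumes the default query operator is OR and that the terms and IDs contain
no characters needing query-syntax escaping.

```python
def build_sentence_query(terms, article_ids):
    """Build '+(t1 t2 ...) +articleId:(id1 id2 ...)' for the sentence core.

    Assumes OR is the default operator, so the bare terms inside each
    group are optional individually while each '+' group is required.
    No escaping is done; terms and IDs are assumed to be plain tokens.
    """
    term_clause = " ".join(terms)
    id_clause = " ".join(article_ids)
    return f"+({term_clause}) +articleId:({id_clause})"


# Example: the IDs would come from the article core with start=0&rows=20.
print(build_sentence_query(["apples", "orange"], ["articleId1", "articleId2"]))
# prints: +(apples orange) +articleId:(articleId1 articleId2)
```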

Steve
