Re: How are results merged from a multisearcher?

Ken Krugler Thu, 18 May 2006 10:20:36 -0700

Greetings,

Could someone describe how the results from multiple indices are merged when
using a MultiSearcher? My naive intuition is that the scores for documents
found in each index could be wildly different, so what criteria is used to
merge the scored docs?


I believe they are blindly merged.

This means that the IDFs for terms must be relatively equal across the multiple indices; otherwise the results will be skewed.
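
To illustrate the skew, here is a minimal sketch that assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1)). The document counts are made up for illustration and the helper name `idf` is hypothetical. It shows the same term getting a very different weight in each index, and a third value again if the statistics were pooled:

```python
import math

def idf(doc_freq, num_docs):
    """Classic Lucene-style IDF (assumed formula): 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical statistics for one term in two indices of equal size.
idf_a = idf(doc_freq=10, num_docs=100_000)       # term is rare in index A
idf_b = idf(doc_freq=20_000, num_docs=100_000)   # term is common in index B

# What the IDF would be if both indices shared pooled statistics.
idf_global = idf(doc_freq=10 + 20_000, num_docs=200_000)

print(f"index A: {idf_a:.3f}  index B: {idf_b:.3f}  pooled: {idf_global:.3f}")
```

A hit from index A gets a much larger term weight than an equally good hit from index B, so when scores are merged as-is the hits from A tend to crowd out the hits from B.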

The simple approach that most people take when dealing with this issue is to generate a larger set of smaller indices from the total data set, then randomize the selection of indices that get merged to form the N final indices. This randomization helps avoid the IDF skew problem.

There's a Jira issue on the Nutch side (see NUTCH-92) about this same problem.
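
The randomized selection described above can be sketched roughly as follows. This is only an illustration: the function name `assign_shards` and the part names are hypothetical, and the merge of each group into one final index (for example with Lucene's IndexWriter.addIndexes) is not shown.

```python
import random

def assign_shards(small_indices, n_final, seed=None):
    """Shuffle the small indices, then deal them round-robin into n_final groups.

    Each returned group lists the small indices to merge into one final index.
    """
    rng = random.Random(seed)       # seed only to make a run repeatable
    shuffled = list(small_indices)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_final)]
    for i, name in enumerate(shuffled):
        groups[i % n_final].append(name)
    return groups

# Hypothetical example: 10 small indices spread over 3 final indices.
parts = [f"part-{i}" for i in range(10)]
groups = assign_shards(parts, 3, seed=42)
for n, group in enumerate(groups):
    print(f"final index {n}: {group}")
```

Because each final index receives a random sample of the data, term frequencies should end up roughly similar across the final indices, which keeps their IDFs close.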


-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
