On 18 Jul 2008, at 09:49, Eric Bowman wrote:
> One thing I have trouble understanding is how scoring works in this
> case. Does Lucene really "just work", or are there special things
> we have to do to make sure that the scores are coherent so we can
> actually decide which was the best match? What kind of constraints
> are there when breaking up the index into parts to make sure scoring
> remains coherent?

AFAIK the scores would suffer from splitting up the index, since tf/idf
would then represent only a part of the index, i.e. two identical documents
in two different indices would end up with different scores because the
index metadata differs. I have no clue how large the impact could be, nor
whether there are good and bad ways to split an index.
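
To make the effect concrete, here is a minimal sketch in Python (not Lucene code). It assumes the classic Lucene-style idf, 1 + ln(numDocs / (docFreq + 1)); the two shards and their counts are made up for illustration.

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene-style idf (assumed here): 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical shards of equal size in which the same term is
# rare in one shard and common in the other.
shard_a = {"doc_freq": 10, "num_docs": 1000}
shard_b = {"doc_freq": 400, "num_docs": 1000}

idf_a = idf(**shard_a)
idf_b = idf(**shard_b)

# An identical document matching this term gets a higher term weight
# in shard A than in shard B, purely because of where it was indexed.
print(f"idf in shard A: {idf_a:.3f}")
print(f"idf in shard B: {idf_b:.3f}")
```

The more unevenly a term is distributed across the parts, the further apart the per-part scores drift, so a split that spreads documents randomly should hurt less than one that groups similar documents together.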
One solution I can think of is to share the complete index across all nodes
but restrict the results from each node to a subset of the index using
a filter. This should produce the right scores but will probably be a
bit slower than splitting the index.
Perhaps it would be possible to split the index for searching but use
an alternative source for scoring.
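
One way to read "an alternative source for scoring" is to collect the per-part statistics and score every part with the combined numbers. A rough sketch in Python, again assuming the classic idf formula; global_idf and the shard counts are hypothetical names and values, not part of any Lucene API:

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene-style idf (assumed here): 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def global_idf(shard_stats):
    # shard_stats: list of (doc_freq, num_docs) pairs, one per index part.
    # Summing both counts gives the statistics of the unsplit index.
    total_df = sum(df for df, _ in shard_stats)
    total_docs = sum(n for _, n in shard_stats)
    return idf(total_df, total_docs)

# Hypothetical per-part statistics for one term.
shards = [(10, 1000), (400, 1000)]

# Every part would use this shared value instead of its local idf,
# so identical documents score the same wherever they are stored.
shared = global_idf(shards)
print(f"shared idf: {shared:.3f}")
```

The cost is an extra round trip to gather the document frequencies for the query terms before the actual search runs on each part.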
karl