Re: Scaling

Karl Wettin Fri, 18 Jul 2008 06:34:01 -0700


On 18 Jul 2008, at 09:49, Eric Bowman wrote:

One thing I have trouble understanding is how scoring works in this case. Does Lucene really "just work", or are there special things we have to do to make sure that the scores are coherent so we can actually decide which was the best match? What kind of constraints are there when breaking up the index into parts to make sure scoring remains coherent?

AFAIK the score would suffer from splitting up the index, as tf/idf then only represents a part of the index, i.e. two identical documents in two indices would end up with different scores because the index metadata is different. I have no clue how large the impact could be, nor whether there are good and bad ways to split an index.
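A toy illustration of that effect, not Lucene code: the sketch below scores the same document against two small in-memory "indices". The `idf` formula (1 + ln(N / (df + 1))) and the sqrt(tf) weighting are assumptions made for illustration, and the names `idf`, `score`, `index_a` and `index_b` are hypothetical.

```python
import math

def idf(num_docs, doc_freq):
    # Assumed classic form: 1 + ln(N / (df + 1)); other similarity formulas differ.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def score(index, doc, term):
    # Toy tf-idf: sqrt(term frequency) times an idf computed from this index alone.
    doc_freq = sum(1 for d in index if term in d)
    return math.sqrt(doc.count(term)) * idf(len(index), doc_freq)

doc = ["lucene", "scaling"]

# Same document, but "lucene" is common in index_a and rare in index_b.
index_a = [doc, ["lucene", "index"], ["lucene", "search"]]
index_b = [doc, ["java", "index"], ["java", "search"]]

score_a = score(index_a, doc, "lucene")
score_b = score(index_b, doc, "lucene")
print(score_a, score_b)
```

The identical document scores lower in `index_a`, where the term is common, than in `index_b`, so hits from the two parts cannot be merged by raw score without some correction.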

One solution I can think of is to share the complete index across all nodes but restrict the results from each node to a subset of the index using a filter. This should produce the right score but will probably be a bit slower than splitting the index.
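A minimal sketch of the filter idea, again as plain Python rather than Lucene API calls: every node holds the full index and derives its term statistics from it, but only returns hits whose ids fall in its assigned subset. The function `search_node`, the id sets and the scoring formula are all hypothetical.

```python
import math

def search_node(full_index, allowed_ids, term):
    # Statistics come from the complete index, so every node agrees on idf.
    num_docs = len(full_index)
    doc_freq = sum(1 for terms in full_index.values() if term in terms)
    weight = 1.0 + math.log(num_docs / (doc_freq + 1))  # assumed idf form

    hits = []
    for doc_id, terms in full_index.items():
        if doc_id not in allowed_ids:
            continue  # the "filter": this node only reports its own subset
        tf = terms.count(term)
        if tf:
            hits.append((doc_id, math.sqrt(tf) * weight))
    return hits

full_index = {
    1: ["lucene", "scaling"], 2: ["lucene", "index"], 3: ["lucene", "search"],
    4: ["lucene", "scaling"], 5: ["java", "index"], 6: ["java", "search"],
}

node_a = search_node(full_index, {1, 2, 3}, "lucene")
node_b = search_node(full_index, {4, 5, 6}, "lucene")
merged = sorted(node_a + node_b)
single = sorted(search_node(full_index, set(full_index), "lucene"))
```

Here `merged` matches what a single node searching the whole index would return, at the cost of every node storing and reading the full index.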

Perhaps it would be possible to split the index for searching but use an alternative source for scoring.
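One way to read "an alternative source for scoring" is to collect document counts and document frequencies from all parts first, then have each part score with those global numbers. The sketch below assumes that reading; `local_stats`, `score_shard` and the idf formula are hypothetical and do not correspond to any Lucene class.

```python
import math

def local_stats(shard, term):
    # Each part reports (number of documents, documents containing the term).
    return len(shard), sum(1 for terms in shard.values() if term in terms)

def score_shard(shard, term, num_docs, doc_freq):
    # Score local documents using statistics supplied from outside the shard.
    weight = 1.0 + math.log(num_docs / (doc_freq + 1))  # assumed idf form
    return {
        doc_id: math.sqrt(terms.count(term)) * weight
        for doc_id, terms in shard.items()
        if term in terms
    }

shard_a = {"a1": ["lucene", "scaling"], "a2": ["lucene", "index"], "a3": ["lucene", "search"]}
shard_b = {"b1": ["lucene", "scaling"], "b2": ["java", "index"], "b3": ["java", "search"]}
shards = [shard_a, shard_b]

# Step 1: gather and sum the per-shard statistics.
stats = [local_stats(s, "lucene") for s in shards]
global_n = sum(n for n, _ in stats)
global_df = sum(df for _, df in stats)

# Step 2: every shard scores with the same global statistics.
scores = {}
for s in shards:
    scores.update(score_shard(s, "lucene", global_n, global_df))
```

With shared statistics the identical documents `a1` and `b1` receive the same score even though they live in different parts; the trade-off is an extra round of communication before scoring.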



          karl

