On Wednesday 27 February 2008 03:33:53 Itamar Syn-Hershko wrote:
> I'm still trying to engineer the best possible solution for Lucene with
> Hebrew, right now my path is NOT using a stemmer by default, only by
> explicit request of the user. MoreLikeThis would only return relevant
> results if I will use a non-stemmed scoring and lookup.

This appears to be the case for all languages, not just Hebrew: stemming
skews the similarity calculation, so unrelated documents end up scoring
higher than they should.

Some people seem to be working around this by indexing the same text into
two fields, one stemmed and one not.  You could then use the stemmed field
when doing queries but use the non-stemmed field for MoreLikeThis.
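
To make the idea concrete, here is a rough, Lucene-free sketch in plain Java.
Everything in it is made up for illustration: the field names "body" and
"body_stemmed", the toy suffix-stripping stem() method, and the Jaccard
overlap that stands in for MoreLikeThis scoring. It only shows why the two
fields behave differently, not how Lucene implements either of them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TwoFieldSketch {

    // Hypothetical toy stemmer: strips a few English suffixes.
    // A real setup would use a proper stemming analyzer instead.
    static String stem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.length() > suffix.length() + 2 && token.endsWith(suffix)) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // Lowercases the text and splits it on non-word characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexes the same text twice: exact terms in "body",
    // stemmed terms in "body_stemmed" (both names are assumptions).
    static Map<String, Set<String>> indexDocument(String text) {
        Set<String> exact = new TreeSet<>();
        Set<String> stemmed = new TreeSet<>();
        for (String t : tokenize(text)) {
            exact.add(t);
            stemmed.add(stem(t));
        }
        Map<String, Set<String>> fields = new HashMap<>();
        fields.put("body", exact);
        fields.put("body_stemmed", stemmed);
        return fields;
    }

    // Ordinary queries go against the stemmed field for better recall.
    static boolean matchesQuery(Map<String, Set<String>> doc, String queryTerm) {
        return doc.get("body_stemmed").contains(stem(queryTerm.toLowerCase()));
    }

    // Similarity uses only the non-stemmed field. Jaccard overlap is a
    // stand-in here, not the scoring MoreLikeThis actually performs.
    static double similarity(Map<String, Set<String>> a, Map<String, Set<String>> b) {
        Set<String> intersection = new HashSet<>(a.get("body"));
        intersection.retainAll(b.get("body"));
        Set<String> union = new HashSet<>(a.get("body"));
        union.addAll(b.get("body"));
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> a = indexDocument("indexing documents");
        Map<String, Set<String>> b = indexDocument("indexed documents");
        System.out.println("query match on stemmed field: " + matchesQuery(a, "indexed"));
        System.out.println("similarity on exact field: " + similarity(a, b));
    }
}
```

In Lucene itself the equivalent would be to add the text to two fields
analyzed differently (PerFieldAnalyzerWrapper is one way to do that) and to
restrict MoreLikeThis to the non-stemmed field with setFieldNames().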

Daniel
