Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi Mitch,

thanks for the answer and the link.

The use case is to provide content-based recommendations for a single item, no matter where that item came from. So this input (match) item is the best match, all MoreLikeThis items are compared to it, and the ones that are most like it get the highest scores. (This also means that the most similar items are probably not the best recommendations, because they are too similar. But that is a different story.)

Again, I don't want to compare the scores of regular search results (e.g. from dismax) with those of mlt. I only want a way to show the user a kind of relevancy or similarity indicator (for example a scale of 10 stars) that hints at how similar the mlt hit is to the input (match) item.

Greetings from Munich ;-)
Chantal

On Thu, 2010-06-24 at 17:06 +0200, MitchK wrote:

Chantal,

have a look at the MoreLikeThis documentation (http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html) to get an idea of what the MLT score is based on.

The problem is that you can't compare scores. The query for the normal result response was maybe something like "Bill Gates featuring Linus Torvald - The perfect OS song". The user now picks one of the returned documents and asks for more like this, maybe because the topic was right but the content was not enough, or whatever. But the query that is sent is totally different (as you can see in the link), so that would be like comparing apples and oranges, since the two scores do not use the same basis.

What would be the use case? Why is score normalization needed?

Kind regards from Germany,
- Mitch
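For illustration, the star indicator Chantal describes could be sketched in Python as below. This is only a sketch under stated assumptions: the function name `stars`, the rounding rule, and the sample scores are hypothetical and do not come from Solr or Lucene. Whether the ratio of a hit's score to a reference score is a meaningful measure of similarity is exactly the open question in this thread.

```python
import math

def stars(score, reference, max_stars=10):
    """Map a score to 1..max_stars relative to a reference score.

    Hypothetical helper: 'reference' is whatever score the caller treats
    as the best possible one, for example the maxScore of the same response.
    """
    if reference <= 0:
        raise ValueError("reference score must be positive")
    # Clamp the ratio to [0, 1] so a hit never gets more than max_stars.
    fraction = min(max(score / reference, 0.0), 1.0)
    # Round up, and give every returned hit at least one star.
    return max(1, math.ceil(fraction * max_stars))

# Hypothetical scores of three mlt hits from one single response.
hit_scores = [4.0, 2.0, 0.5]
reference = max(hit_scores)
ratings = [stars(s, reference) for s in hit_scores]
print(ratings)
```

Used this way, the rating only ranks hits within one response. It does not make ratings from two different MoreLikeThis requests comparable.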
Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi Chantal,

Munich? Germany seems to be so small :-).

Chantal Ackermann wrote:

I only want a way to show to the user a kind of relevancy or similarity indicator (for example using a range of 10 stars) that would give a hint on how similar the mlt hit is to the input (match) item.

Okay, that makes more sense. Unfortunately, as far as I know, you cannot do that with Lucene in a way that gives results that fit your needs.

Kind regards
- Mitch

--
View this message in context: http://lucene.472066.n3.nabble.com/MoreLikeThis-mlt-use-the-match-s-maxScore-for-result-score-normalization-tp919598p921942.html
Sent from the Solr - User mailing list archive at Nabble.com.
MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi there,

consider the following response extract for a MoreLikeThis request:

<result name="match" numFound="1" start="0" maxScore="13.4579935">
<result name="response" numFound="103708" start="0" maxScore="4.1711807">

The first result element is the input document, the one for which more-like-this results are requested. The second result element contains the results returned by the handler.

As the two come with different maxScore values, I was wondering whether I could safely use the match's maxScore to normalize the scores of the more-like-this documents. Would that let me show the user the quality/relevancy of the hits across different MoreLikeThis requests (and only those)?

(What does the match's maxScore mean?)

Thanks!
Chantal
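To make the arithmetic behind the question concrete, here is a small Python sketch that reads the two maxScore values from the extract and divides one by the other. It assumes the extract is XML with quoted attributes (the punctuation is an assumption about how the original response looked), and the helper name `max_scores` is hypothetical. It only shows what the proposed normalization would compute and makes no claim that the result is meaningful.

```python
import re

# Response extract from the message above; the angle brackets and quotes
# are an assumption about the original XML.
extract = '''
<result name="match" numFound="1" start="0" maxScore="13.4579935">
<result name="response" numFound="103708" start="0" maxScore="4.1711807">
'''

def max_scores(text):
    """Return {result name: maxScore} for every <result ...> tag in text."""
    pattern = re.compile(r'<result\s+name="([^"]+)"[^>]*\bmaxScore="([^"]+)"')
    return {name: float(score) for name, score in pattern.findall(text)}

scores = max_scores(extract)
# Best mlt hit divided by the match's maxScore: the proposed normalization.
ratio = scores["response"] / scores["match"]
print(f"best mlt hit relative to the match's maxScore: {ratio:.3f}")
```

With these numbers the best more-like-this hit would land at roughly 0.31 of the match's maxScore, which is why it matters what the match's maxScore is actually derived from.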
Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Hi Otis,

thank you for this super quick answer. I understand that normalizing and comparing scores is fishy, and I wouldn't want to do it for regular search results.

I just thought that in this special case, the maxScore that is returned for the input document to the MoreLikeThis handler (it is only present in MoreLikeThis responses with include=true) might be the missing additional value that would allow normalization. (In this special case there are two maxScores.)

But I don't know what the match's maxScore is derived from. As the input document should surely be the best match for the request, doesn't a maxScore of 13.4579935 look suspicious?

Thanks,
Chantal

On Thu, 2010-06-24 at 16:25 +0200, Otis Gospodnetic wrote:

Chantal,

The short answer is that you can't compare relevancy scores across requests. I think this may be in a FAQ. Check this:
http://search-lucene.com/?q=score+compare+absolute+relativefc_project=Lucenefc_project=Solr

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
From: Chantal Ackermann chantal.ackerm...@btelligent.de
To: solr-user@lucene.apache.org
Sent: Thu, June 24, 2010 10:17:57 AM
Subject: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

Hi there,

consider the following response extract for a MoreLikeThis request:

<result name="match" numFound="1" start="0" maxScore="13.4579935">
<result name="response" numFound="103708" start="0" maxScore="4.1711807">

The first result element is the document that was input and for which to return more like this results. The second result element contains the results returned by the handler.

As they both come with a different maxScore I was wondering whether I could safely use the match's maxScore to normalize the scores of the more like this documents. Would that allow to reflect to the user the quality/relevancy of the hits for different MoreLikeThis requests (and only those)?

(What does the match's maxScore mean?)

Thanks!
Chantal
Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization
Chantal,

have a look at the MoreLikeThis documentation (http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html) to get an idea of what the MLT score is based on.

The problem is that you can't compare scores. The query for the normal result response was maybe something like "Bill Gates featuring Linus Torvald - The perfect OS song". The user now picks one of the returned documents and asks for more like this, maybe because the topic was right but the content was not enough, or whatever. But the query that is sent is totally different (as you can see in the link), so that would be like comparing apples and oranges, since the two scores do not use the same basis.

What would be the use case? Why is score normalization needed?

Kind regards from Germany,
- Mitch