Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-25 Thread Chantal Ackermann
Hi Mitch,

thanks for the answer and the link.

The use case is to provide content-based recommendations for a single
item, no matter where it came from. So this input (match) item is the
best match, all more-like-this items are compared to it, and the ones
that are most alike would have the highest scores.

(This also means that the most similar items are probably not the best
recommendations, because they are too similar. But that is a different
story.)

Again, I don't want to compare the scores of regular search results
(e.g. from dismax) with those of mlt. I only want a way to show the
user some kind of relevancy or similarity indicator (for example on a
scale of 10 stars) that gives a hint of how similar the mlt hit is
to the input (match) item.
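
A minimal sketch of such an indicator, assuming some reference score is
available to divide by. Using the match's maxScore as that reference is
only an assumption here, and whether it is sound is the open question of
this thread. The function name stars and its parameters are made up for
illustration.

```python
import math

def stars(hit_score, reference_score, max_stars=10):
    """Map hit_score / reference_score onto 1..max_stars.

    reference_score is an assumed upper bound for hit_score; ratios
    outside 0..1 are clamped instead of being rejected.
    """
    if reference_score <= 0:
        raise ValueError("reference_score must be positive")
    ratio = min(max(hit_score / reference_score, 0.0), 1.0)
    # Round up so that every hit gets at least one star.
    return max(1, math.ceil(ratio * max_stars))

# Top mlt hit scored against the match's maxScore.
print(stars(4.1711807, 13.4579935))  # 4
```

With these numbers even the best mlt hit gets only 4 of 10 stars, so a
different reference (for example the response's own maxScore) may be
the more useful divisor for display purposes.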

Greetings from Munich ;-)
Chantal








Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-25 Thread MitchK

Hi Chantal,

Munich? Germany seems to be soo small :-).


Chantal Ackermann wrote:
 
 I only want a way to show to the 
 user a kind of relevancy or similarity indicator (for example using a 
 range of 10 stars) that would give a hint on how similar the mlt hit is 
 to the input (match) item. 
 
Okay, that's making more sense.
Unfortunately, as far as I know, Lucene cannot do that in a way that
would give you results that fit your needs.

Kind regards
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/MoreLikeThis-mlt-use-the-match-s-maxScore-for-result-score-normalization-tp919598p921942.html
Sent from the Solr - User mailing list archive at Nabble.com.


MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-24 Thread Chantal Ackermann
Hi there,

consider the following response extract for a MoreLikeThis request:

<result name="match" numFound="1" start="0" maxScore="13.4579935">
<result name="response" numFound="103708" start="0" maxScore="4.1711807">

The first result element is the input document, for which more-like-this
results are to be returned.
The second result element contains the results returned by the handler.

As each of them comes with its own maxScore, I was wondering whether I
could safely use the match's maxScore to normalize the scores of the
more-like-this documents.

Would that allow me to show the user the quality/relevancy of the
hits across different MoreLikeThis requests (and only those)?
(What does the match's maxScore mean?)
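
To make the question concrete, here is a minimal sketch of the
normalization, assuming an XML response in which every doc carries an id
and a score field. Only the two maxScore values come from the actual
response; the surrounding response element, the doc ids A, B and C and
the score of C are invented for illustration. Whether the resulting
ratios mean anything is exactly what is being asked.

```python
import xml.etree.ElementTree as ET

# Hypothetical response: only the two maxScore values are real.
SAMPLE = """<response>
  <result name="match" numFound="1" start="0" maxScore="13.4579935">
    <doc><str name="id">A</str><float name="score">13.4579935</float></doc>
  </result>
  <result name="response" numFound="103708" start="0" maxScore="4.1711807">
    <doc><str name="id">B</str><float name="score">4.1711807</float></doc>
    <doc><str name="id">C</str><float name="score">2.0855904</float></doc>
  </result>
</response>"""

def normalized_scores(xml_text):
    """Divide each mlt hit's score by the match's maxScore."""
    root = ET.fromstring(xml_text)
    results = {r.get("name"): r for r in root.iter("result")}
    reference = float(results["match"].get("maxScore"))
    ratios = {}
    for doc in results["response"].iter("doc"):
        doc_id = doc.find("str[@name='id']").text
        score = float(doc.find("float[@name='score']").text)
        ratios[doc_id] = score / reference
    return ratios

for doc_id, ratio in normalized_scores(SAMPLE).items():
    print(doc_id, round(ratio, 3))
```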

Thanks!
Chantal



Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-24 Thread Chantal Ackermann
Hi Otis,

thank you for this super quick answer. I understand that normalizing and
comparing scores is fishy, and I wouldn't want to do it for regular
search results.

I just thought that in this special case, the maxScore that is returned
for the input document to the MoreLikeThis handler (it is only present
in MoreLikeThis responses requested with mlt.match.include=true) might
be the missing additional value to normalize against. (In this special
case there are two maxScores.)

But I don't know what the match's maxScore is derived from. Since the
input element should surely be the best match for the request, doesn't a
maxScore of 13.4579935 look suspicious?

Thanks,
Chantal




On Thu, 2010-06-24 at 16:25 +0200, Otis Gospodnetic wrote:
 Chantal,
 
 The short answer is that you can't compare relevancy scores across requests.  
 I think this may be in a FAQ.
 Check this:
 http://search-lucene.com/?q=score+compare+absolute+relative&fc_project=Lucene&fc_project=Solr
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/
 
 
 





Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-24 Thread MitchK

Chantal,

have a look at the MoreLikeThis documentation
http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html
to get an idea of what the MLT score is based on.

The problem is that you can't compare scores.
The query for the normal result response was maybe something like
"Bill Gates featuring Linus Torvalds - The perfect OS song".
The user now picks one of the returned documents and says he wants "more
like this", maybe because the topic was okay but the content was not
enough, or whatever...
But the query that is sent is totally different (as you can see in the
link), so that would be like comparing apples and oranges, since they do
not use the same base.

What would be the use case? Why is score-normalization needed?

Kind regards from Germany,
- Mitch