Hi - MoreLikeThis is not based on cosine similarity. The idea is that rare 
terms - those with a high IDF - are extracted from the source document and then 
used to build a regular Query. That query follows the same rules as any other 
query, i.e. the rules of your similarity implementation, which is TF-IDF by 
default. So, as suggested, if you enable debugging you can clearly see why 
scores can be above 1, or even much higher if queryNorm is disabled, as when 
using BM25 as similarity.
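
To illustrate the paragraph above, here is a minimal Python sketch of the idea. It is not Lucene's MoreLikeThis code: the function names, the toy corpus and the simplified formulas (idf = 1 + ln(N / (df + 1)), term weight = sqrt(tf) * idf^2, no queryNorm, no length norm) are assumptions made for the example. It only shows that a query built from rare terms and scored as a sum of term weights is not bounded by 1.

```python
import math
from collections import Counter


def idf(term, docs):
    # Simplified classic-style IDF (assumption): rare terms get larger values.
    df = sum(1 for d in docs if term in d)
    return 1.0 + math.log(len(docs) / (df + 1))


def interesting_terms(source, docs, max_terms=3):
    # Rank the source document's terms by tf * idf and keep the top few.
    # Ties are broken alphabetically so the result is deterministic.
    tf = Counter(source)
    ranked = sorted(tf, key=lambda t: (-(tf[t] * idf(t, docs)), t))
    return ranked[:max_terms]


def score(query_terms, doc, docs):
    # Unnormalised sum over the query terms that occur in the document.
    tf = Counter(doc)
    return sum(
        math.sqrt(tf[t]) * idf(t, docs) ** 2 for t in query_terms if t in tf
    )


# Hypothetical toy corpus, already tokenised.
docs = [
    "solr lucene search index".split(),
    "lucene scoring similarity tfidf tfidf".split(),
    "cooking pasta recipe".split(),
    "search engine ranking".split(),
]
source = "lucene tfidf scoring similarity tfidf".split()

terms = interesting_terms(source, docs)
scores = [score(terms, d, docs) for d in docs]

for i, s in enumerate(scores):
    print(f"doc {i}: {s:.3f}")
```

The document that shares the rare terms gets a score well above 1, and documents sharing none of them get 0. In Solr, debug=true shows the real per-term breakdown.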

If you really need cosine similarity between documents, you have to enable term 
vectors for the source fields and use them to calculate the angle. The problem 
is that this does not scale well: you would need to calculate the angle against 
virtually every other document.
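
As a rough sketch of what that calculation looks like, assuming the term vectors have already been read out as term -> frequency maps (the helper names and sample data are made up, and this is plain Python, not a Lucene API):

```python
import math
from collections import Counter


def cosine(a, b):
    # Cosine of the angle between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(source_vector, all_vectors):
    # One pass over the whole collection per source document:
    # this full scan is the part that does not scale.
    scored = [(cosine(source_vector, v), i) for i, v in enumerate(all_vectors)]
    return sorted(scored, reverse=True)


# Hypothetical term vectors for three documents.
vectors = [
    Counter("lucene scoring similarity tfidf tfidf".split()),
    Counter("solr lucene search index".split()),
    Counter("cooking pasta recipe".split()),
]
source_vector = Counter("lucene tfidf scoring".split())

ranking = most_similar(source_vector, vectors)
for sim, doc_id in ranking:
    print(f"doc {doc_id}: {sim:.3f}")
```

With non-negative term frequencies the result always falls between 0 and 1, which is the range the original question expected, but every source document costs a scan of the whole collection.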

M.
 
-----Original message-----
> From:Ali Nazemian <alinazem...@gmail.com>
> Sent: Monday 2nd February 2015 21:39
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene cosine similarity score for more like this query
> 
> Dear Erik,
> Thank you for your response. Would you please tell me why this score could
> be higher than 1, while cosine similarity cannot be higher than 1?
> On Feb 2, 2015 7:32 PM, "Erik Hatcher" <erik.hatc...@gmail.com> wrote:
> 
> > The scoring is the same as Lucene.  To get deeper insight into how a score
> > is computed, use Solr’s debug=true mode to see the explain details in the
> > response.
> >
> >         Erik
> >
> > > On Feb 2, 2015, at 10:49 AM, Ali Nazemian <alinazem...@gmail.com> wrote:
> > >
> > > Hi,
> > > I was wondering what the range of the score returned by a more like this
> > > query in Solr is. I know that Lucene uses cosine similarity in the vector
> > > space model for calculating similarity between two documents. I also know
> > > that cosine similarity is between -1 and 1, but what I don't
> > > understand is why the score returned by a more like this query could
> > > be "12", for example. Would you please explain what the calculation
> > > process in Solr is?
> > > Thank you very much.
> > >
> > > Best regards.
> > >
> > > --
> > > A.Nazemian
> >
> >
> 
