Hi,

Next to BM25 and TF-IDF, Lucene also provides many more similarity implementations:
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMDirichletSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMJelinekMercerSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/IBSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/DFRSimilarity.html

If you want to implement your own, choose the closest one and implement the formula as you described. I'd start with SimilarityBase, which is an ideal base class for models like Dirichlet / DFR / ..., because it has a default implementation for things like phrases.

> LMDiricletbut its feasibilit

I am not sure what you want to say with this mistyped sentence fragment.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Jack Krupansky [mailto:[email protected]]
> Sent: Monday, December 14, 2015 11:21 PM
> To: [email protected]
> Subject: Re: Jensen–Shannon divergence
>
> Is there any particular reason that you find Lucene's builtin TF/IDF and
> BM25 similarity models insufficient for your needs? In any case,
> examination of their source code should get you started if you wish to do
> your own:
>
> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/BM25Similarity.html
>
> -- Jack Krupansky
>
> On Sun, Dec 13, 2015 at 8:30 AM, Shay Hummel <[email protected]> wrote:
>
> > Hi
> >
> > I need help to implement similarity between query model and document model.
> > I would like to use the JS-Divergence
> > <https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence> for
> > ranking documents. The documents and the query will be represented
> > according to the language models approach - specifically the LMDiriclet.
> > The similarity will be calculated using the JS-Div between the document
> > model and the query model.
> > Is it possible?
> > If so, how?
> >
> > Thank you,
> > Shay Hummel
> > --
> > Regards,
> > Shay Hummel
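
For reference, here is a rough sketch of the formula discussed in this thread: Dirichlet-smoothed language models for the document and the query, compared with the Jensen-Shannon divergence. It is plain Java and deliberately does not touch the Lucene API; the class name `JsDivergenceSketch`, its method names, and the map-based representation of term counts and the collection model are all made up for illustration. How to fit this into a SimilarityBase subclass is not settled here, since the divergence sums over the whole vocabulary rather than only over matching query terms.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
final class JsDivergenceSketch {

    // Dirichlet smoothing: (tf + mu * p(t|C)) / (docLen + mu).
    static double dirichlet(double tf, double docLen, double collectionProb, double mu) {
        return (tf + mu * collectionProb) / (docLen + mu);
    }

    // Builds a smoothed unigram model over the collection vocabulary.
    // Assumes collectionModel sums to 1 and contains every term in counts.
    static Map<String, Double> smoothedModel(Map<String, Integer> counts,
                                             Map<String, Double> collectionModel,
                                             double mu) {
        int len = 0;
        for (int c : counts.values()) {
            len += c;
        }
        Map<String, Double> model = new HashMap<>();
        for (Map.Entry<String, Double> e : collectionModel.entrySet()) {
            int tf = counts.getOrDefault(e.getKey(), 0);
            model.put(e.getKey(), dirichlet(tf, len, e.getValue(), mu));
        }
        return model;
    }

    // Jensen-Shannon divergence (natural log) between two distributions.
    // Terms with zero probability contribute nothing to their KL half.
    static double jsDivergence(Map<String, Double> p, Map<String, Double> q) {
        Set<String> vocab = new HashSet<>(p.keySet());
        vocab.addAll(q.keySet());
        double js = 0.0;
        for (String t : vocab) {
            double pt = p.getOrDefault(t, 0.0);
            double qt = q.getOrDefault(t, 0.0);
            double mt = 0.5 * (pt + qt); // mixture distribution M = (P + Q) / 2
            if (pt > 0) {
                js += 0.5 * pt * Math.log(pt / mt);
            }
            if (qt > 0) {
                js += 0.5 * qt * Math.log(qt / mt);
            }
        }
        return js;
    }
}
```

A smaller divergence means the two models are closer, so for ranking one would probably sort by ascending divergence (or use its negative as the score). The value of mu is left to the caller.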
