You could extend this class and override how term frequency is incorporated into the final score, for example by having tf(float freq) return a constant. For the record, you might also want to look into BM25Similarity, which does take term frequency into account, but saturates it, so repeated occurrences of a term contribute much less to the score than with ClassicSimilarity. More generally, BM25Similarity is considered a superior alternative to ClassicSimilarity (the canonical implementation of TFIDFSimilarity that you linked).
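To make the difference concrete, here is a minimal plain-Java sketch (no Lucene dependency) comparing three term-frequency factors: the sqrt(freq) used by ClassicSimilarity, a BM25-style saturating factor, and the constant factor you are asking for. The class and method names are made up for illustration, and the BM25 formula is simplified by assuming the document has exactly average length and k1 = 1.2. In Lucene itself, the body of constantTf is roughly what an overridden tf(float freq) in your own ClassicSimilarity subclass would return.

```java
public class TfComparison {

    // Term-frequency factor of ClassicSimilarity: sqrt(freq).
    static float classicTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // BM25-style term-frequency factor.
    // Assumption: document length equals the average length, so the
    // length-normalization parameter b drops out of the formula.
    // The value approaches k1 + 1 as freq grows, which is the saturation effect.
    static float bm25Tf(float freq, float k1) {
        return freq * (k1 + 1) / (freq + k1);
    }

    // Constant factor: 1 if the term occurs at all, 0 otherwise,
    // so the number of occurrences no longer influences the score.
    static float constantTf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        float k1 = 1.2f; // assumed value for the BM25 k1 parameter
        for (float freq : new float[] {1, 2, 5, 20, 100}) {
            System.out.printf("freq=%5.0f  classic=%7.3f  bm25=%5.3f  constant=%3.1f%n",
                    freq, classicTf(freq), bm25Tf(freq, k1), constantTf(freq));
        }
    }
}
```

Running main prints one row per frequency; the classic column keeps growing, the BM25 column levels off, and the constant column stays at 1.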
On Tue, Jul 17, 2018 at 19:04, <[email protected]> wrote:

> I forgot to put the doc that I was referring to:
>
> https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> Best regards
>
> On 7/17/18 1:01 PM, [email protected] wrote:
> > Hi,
> >
> > Is there a way to diminish the tf(t in d) component to 1? I don't want
> > the number of times a word appears to affect the scoring for my app.
> >
> > Best regards
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
