pminkov opened a new pull request, #939:
URL: https://github.com/apache/lucene/pull/939

   ### Description
   
   MoreLikeThis selects query terms by their TF-IDF score. The TF part of the 
score used the raw term frequency directly, without passing it through 
ClassicSimilarity.tf(), which takes its square root. As a result, how often a 
term occurs in the input carries too much weight in whether it is selected as a 
search term, which can let more stop words into the final query.
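
   The effect can be illustrated with a small standalone sketch. It does not use Lucene and is not the code in this PR. The class and method names are hypothetical, the idf formula is modeled on ClassicSimilarity.idf(), and the term counts are made-up numbers chosen only to show the ranking flip.

```java
// Hypothetical illustration only: not Lucene code and not the diff in this PR.
final class TfWeightingSketch {

    // Modeled on ClassicSimilarity.idf(): log((docCount + 1) / (docFreq + 1)) + 1
    static double idf(long docFreq, long docCount) {
        return Math.log((docCount + 1) / (double) (docFreq + 1)) + 1.0;
    }

    // Score using the raw term frequency (the behavior described as the problem).
    static double rawScore(int tf, long docFreq, long docCount) {
        return tf * idf(docFreq, docCount);
    }

    // Score using a square-root-dampened term frequency, as ClassicSimilarity.tf() does.
    static double sqrtScore(int tf, long docFreq, long docCount) {
        return Math.sqrt(tf) * idf(docFreq, docCount);
    }

    public static void main(String[] args) {
        long docCount = 1000;

        // Made-up numbers: a stop-word-like term that is frequent in the input
        // and appears in most documents, versus a rarer, more specific term.
        int commonTf = 36;
        long commonDf = 900;
        int rareTf = 4;
        long rareDf = 10;

        System.out.printf("raw:  common=%.2f rare=%.2f%n",
                rawScore(commonTf, commonDf, docCount),
                rawScore(rareTf, rareDf, docCount));
        System.out.printf("sqrt: common=%.2f rare=%.2f%n",
                sqrtScore(commonTf, commonDf, docCount),
                sqrtScore(rareTf, rareDf, docCount));
    }
}
```

   With these assumed counts, the raw-TF score ranks the common term above the rare one, while the square-root version ranks the rare term higher, which is the kind of shift the description refers to.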


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

