pminkov opened a new pull request, #939: URL: https://github.com/apache/lucene/pull/939
### Description MoreLikeThis picks terms by their TF-IDF score. The TF part of the score was used by taking the term frequency directly, without applying a square root through ClassicSimilarity.tf(). The result of this is that how common a term is in an input can have too much weight on whether it's selected as a search term. This can make more stop words make their way into the final query. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org