pminkov opened a new pull request, #940:
URL: https://github.com/apache/lucene/pull/940

   ### Description
   
   MoreLikeThis picks terms by their TF-IDF score. The TF part of the score was 
taken directly from the raw term frequency, without applying the square root 
that ClassicSimilarity.tf() applies. As a result, how often a term occurs in 
the input can carry too much weight in whether it is selected as a query term. 
One negative effect is that more stop words can make their way into the final 
query.
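
To illustrate the effect, here is a minimal standalone sketch. It is not the MoreLikeThis code itself. It assumes the ClassicSimilarity formulas `tf = sqrt(freq)` and `idf = log((docCount + 1) / (docFreq + 1)) + 1`, and the class name, method names, terms, and counts are all hypothetical values chosen for illustration.

```java
// Hypothetical sketch: compares raw-TF and sqrt-TF scoring for two made-up terms.
class TfWeightingSketch {

    // Same shape as ClassicSimilarity.idf (assumed): log((docCount + 1) / (docFreq + 1)) + 1
    static double idf(long docFreq, long docCount) {
        return Math.log((docCount + 1) / (double) (docFreq + 1)) + 1.0;
    }

    // Score using the term frequency directly.
    static double rawTfScore(int freq, long docFreq, long docCount) {
        return freq * idf(docFreq, docCount);
    }

    // Score using the square root of the term frequency, as ClassicSimilarity.tf() does.
    static double sqrtTfScore(int freq, long docFreq, long docCount) {
        return Math.sqrt(freq) * idf(docFreq, docCount);
    }

    public static void main(String[] args) {
        long docCount = 1000; // hypothetical index size

        // Hypothetical stop word: 25 occurrences in the input, present in 900 documents.
        double stopRaw = rawTfScore(25, 900, docCount);
        double stopSqrt = sqrtTfScore(25, 900, docCount);

        // Hypothetical rare term: 4 occurrences in the input, present in 10 documents.
        double rareRaw = rawTfScore(4, 10, docCount);
        double rareSqrt = sqrtTfScore(4, 10, docCount);

        System.out.println("raw tf:  stop word outranks rare term? " + (stopRaw > rareRaw));
        System.out.println("sqrt tf: stop word outranks rare term? " + (stopSqrt > rareSqrt));
    }
}
```

With these made-up counts, the frequent stop word outranks the rare term under raw TF, and the order reverses once the square root is applied.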
   
   ### Tests
   
   Ran MoreLikeThis tests with:
   ```commandline
   ./gradlew -p lucene/queries test --tests TestMoreLikeThis
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

