[ https://issues.apache.org/jira/browse/LUCENE-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741582#comment-13741582 ]
Tom Burton-West commented on LUCENE-5175:
-----------------------------------------

Hi Robert,

I tried running luceneutils with the default wikimedium10m collection and tasks. I ran it first on the DefaultSimilarity, which shouldn't be affected by the patch to BM25Similarity, and it showed about -2.3% difference. I'm guessing there is some inaccuracy in the tests. When I changed DEFAULT_SIMILARITY to BM25Similarity, the worst change was a difference of -8.8%.

Is there a separate mailing list for questions about luceneutils, or should I write to the java-dev list? Or directly to Mike or you?

Tom

> Add parameter to lower-bound TF normalization for BM25 (for long documents)
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-5175
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5175
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Tom Burton-West
>            Priority: Minor
>         Attachments: LUCENE-5175.patch
>
>
> In the article "When Documents Are Very Long, BM25 Fails!" a fix for the
> problem is documented. There was a TODO note in BM25Similarity to add this
> fix. I will attach a patch that implements the fix shortly.