[ https://issues.apache.org/jira/browse/LUCENE-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom Burton-West updated LUCENE-5175:
------------------------------------

    Attachment: LUCENE-5175.patch

Patch adds optional parameter delta to lower-bound tf normalization. Attached also are unit tests. Still need to add tests of the explanation/scoring for cases 1) no norms, and 2) no delta.

If no delta parameter is supplied, the math works out to the equivalent of the regular BM25 formula as far as the score, but I think there is an extra step or two to get there. I'll see if I can get some benchmarks running to see if there is any significant performance issue.

> Add parameter to lower-bound TF normalization for BM25 (for long documents)
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-5175
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5175
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Tom Burton-West
>            Priority: Minor
>         Attachments: LUCENE-5175.patch
>
>
> In the article "When Documents Are Very Long, BM25 Fails!" a fix for the
> problem is documented. There was a TODO note in BM25Similarity to add this
> fix. I will attach a patch that implements the fix shortly.
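
For readers following the comment about delta, here is a minimal sketch of the tf component only. It is not the code in LUCENE-5175.patch. It assumes the lower-bounded form described in the cited article, where tf is first divided by the length normalization and then shifted by delta. The class name `Bm25LSketch`, the method names, and the parameter values in `main` are hypothetical and chosen only for illustration. With delta = 0 the two methods should agree up to floating-point rounding, which matches the remark that the math works out to regular BM25 with an extra step or two.

```java
/** Hypothetical sketch of the tf component; not the code from LUCENE-5175.patch. */
final class Bm25LSketch {

    /** Regular BM25 tf component: (k1 + 1) * tf / (tf + k1 * (1 - b + b * dl / avgdl)). */
    static double bm25Tf(double tf, double dl, double avgdl, double k1, double b) {
        double norm = 1.0 - b + b * dl / avgdl;
        return (k1 + 1.0) * tf / (tf + k1 * norm);
    }

    /**
     * Lower-bounded variant (assumed form): normalize tf by document length first,
     * then add delta before applying the k1 saturation. Terms that do not occur
     * in the document (tf <= 0) contribute nothing.
     */
    static double bm25LTf(double tf, double dl, double avgdl,
                          double k1, double b, double delta) {
        if (tf <= 0.0) {
            return 0.0;
        }
        double norm = 1.0 - b + b * dl / avgdl;
        double normalizedTf = tf / norm; // the "extra step" compared to regular BM25
        return (k1 + 1.0) * (normalizedTf + delta) / (k1 + normalizedTf + delta);
    }

    public static void main(String[] args) {
        // Illustrative values only: a document 50x longer than average.
        double k1 = 1.2, b = 0.75, avgdl = 100.0, dl = 5000.0, tf = 3.0;
        System.out.println("BM25 tf component:        " + bm25Tf(tf, dl, avgdl, k1, b));
        System.out.println("lower-bounded, delta=0:   " + bm25LTf(tf, dl, avgdl, k1, b, 0.0));
        System.out.println("lower-bounded, delta=0.5: " + bm25LTf(tf, dl, avgdl, k1, b, 0.5));
    }
}
```

In this form, a matching term in a very long document keeps a tf component above (k1 + 1) * delta / (k1 + delta) instead of decaying toward zero, which is the lower bound the issue title refers to.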