[ https://issues.apache.org/jira/browse/LUCENE-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738664#comment-13738664 ]
Robert Muir commented on LUCENE-5175:
-------------------------------------
Hi Tom:
For example, I know for a fact that I tried to remove the crazy "cache" that
this thing creates (I created the monster), and removing it always hurt
performance.
But I don't think we need to worry too much because:
# We should benchmark it the way you have it first and just see what we are
dealing with
# IF there is a problem, we could try to open it up to subclassing better,
maybe it even improves the API
# There is also the option of just having specialized SimScorers for the
delta=0 case.
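To make the delta=0 idea concrete, here is a minimal standalone sketch. It is
not Lucene's SimScorer API and not the code in the attached patch: the class
Bm25DeltaSketch, the TfScorer interface and all method names are hypothetical,
and the formula assumes the BM25L-style shift, where delta is added to the
length-normalized term frequency before saturation.

```java
/**
 * Sketch of BM25 tf saturation with a lower-bound shift (delta).
 * All names here are illustrative; this is not Lucene's SimScorer API.
 */
public class Bm25DeltaSketch {

    /** Hypothetical stand-in for a per-term scorer. */
    interface TfScorer {
        float score(float freq, float docLen);
    }

    /** Length-normalized tf: c = freq / (1 - b + b * docLen / avgDocLen). */
    static float normalizedTf(float freq, float docLen, float avgDocLen, float b) {
        return freq / (1 - b + b * docLen / avgDocLen);
    }

    /** General case: shift the normalized tf by delta before saturating. */
    static TfScorer withDelta(float idf, float k1, float b, float avgDocLen, float delta) {
        return (freq, docLen) -> {
            if (freq <= 0) return 0f; // no match, no lower bound applied
            float c = normalizedTf(freq, docLen, avgDocLen, b) + delta;
            return idf * (k1 + 1) * c / (k1 + c);
        };
    }

    /** Specialized case for delta == 0: plain BM25, no extra add per hit. */
    static TfScorer plain(float idf, float k1, float b, float avgDocLen) {
        return (freq, docLen) -> {
            if (freq <= 0) return 0f;
            float c = normalizedTf(freq, docLen, avgDocLen, b);
            return idf * (k1 + 1) * c / (k1 + c);
        };
    }

    /** Pick the specialized scorer when delta is exactly zero. */
    static TfScorer create(float idf, float k1, float b, float avgDocLen, float delta) {
        return delta == 0f
            ? plain(idf, k1, b, avgDocLen)
            : withDelta(idf, k1, b, avgDocLen, delta);
    }
}
```

The point of the split is only that the delta=0 path does no more work per hit
than plain BM25; whether that matters in practice is what the benchmark
should tell us.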
So I am confident we will find a good solution.
As far as luceneutil goes, we tried creating a README
(http://code.google.com/a/apache-extras.org/p/luceneutil/source/browse/README.txt)
to help people get started.
The basic idea is that you pull down two different checkouts of lucene-trunk
and set up a "competition" between the two. Two options are important here:
one sets the similarity for each competitor, and the other can disable score
comparisons (I haven't yet examined the patch to tell whether the scores might
differ slightly, e.g. because of the order of floating-point ops and stuff).
But that's typically how I benchmark two Sim impls against each other.
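As an illustration of why exact score comparison can be too strict, the sketch
below computes the same BM25 tf factor in two algebraically equivalent ways and
compares the results with a relative tolerance. It says nothing about how
luceneutil itself compares scores; the class ScoreCompareSketch and its methods
are hypothetical names made up for this example.

```java
/**
 * Two algebraically equal ways to compute a BM25 tf factor in float.
 * Rounding can make the results differ in the last bits, so they are
 * compared with a relative tolerance instead of ==.
 */
public class ScoreCompareSketch {

    /** Textbook form: tf * (k1 + 1) / (tf + k1 * norm). */
    static float formA(float tf, float k1, float norm) {
        return tf * (k1 + 1) / (tf + k1 * norm);
    }

    /** Rearranged form: divide tf by norm first, then saturate. */
    static float formB(float tf, float k1, float norm) {
        float c = tf / norm;
        return (k1 + 1) * c / (k1 + c);
    }

    /** True when a and b agree within the given relative tolerance. */
    static boolean closeEnough(float a, float b, float relTol) {
        float scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= relTol * scale;
    }
}
```

If the two checkouts order their float operations differently, a check like
closeEnough(a, b, 1e-5f) is the kind of comparison that would still pass where
a == b might not.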
> Add parameter to lower-bound TF normalization for BM25 (for long documents)
> ---------------------------------------------------------------------------
>
> Key: LUCENE-5175
> URL: https://issues.apache.org/jira/browse/LUCENE-5175
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Tom Burton-West
> Priority: Minor
> Attachments: LUCENE-5175.patch
>
>
> In the article "When Documents Are Very Long, BM25 Fails!" a fix for the
> problem is documented. There was a TODO note in BM25Similarity to add this
> fix. I will attach a patch that implements the fix shortly.