Re: "final" modifier on some methods in TFIDFSimilarity class

Hafiz Hamid Fri, 24 Oct 2014 10:45:00 -0700

Alan - Thanks for the idea. We don't want to invent a new scoring formula,
so we'd rather not write a new Similarity class. We want to keep everything
DefaultSimilarity/TFIDFSimilarity already provides and override the
computation of only a single component (i.e. fieldNorm) of the existing
tf-idf scoring. Creating a new class would mean copying and pasting the
existing TFIDFSimilarity code, which would make it hard to upgrade and keep
in sync with future versions. Making the change in the original code would
also let others benefit from it, and we don't see it posing any risk.

In case you're interested, we want to move the length-norm computation from
index time to search time. That will allow us to change the length-norm
function and A/B test it against the default without having to re-create
the index, which is an extremely expensive task for us. We'll simply store
the raw field length (#terms) as the fieldNorm and change the scorer to
compute the length-norm from it at search time.
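
To make the idea concrete, here is a rough sketch in plain Java with no
Lucene dependency. Every name in it (SearchTimeLengthNorm, encodeNorm,
score, DEFAULT_NORM, LOG_NORM) is hypothetical and not part of Lucene's
API, and the score is a simplified tf * idf * lengthNorm product rather
than the full TFIDFSimilarity formula. LOG_NORM is only an illustrative
candidate for an A/B test, not something we have settled on.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch: none of these names are Lucene APIs.
public class SearchTimeLengthNorm {

    // Same shape as the default length norm: 1 / sqrt(numTerms).
    static final IntToDoubleFunction DEFAULT_NORM =
        numTerms -> numTerms > 0 ? 1.0 / Math.sqrt(numTerms) : 0.0;

    // Illustrative alternative for an A/B test: penalizes long fields less.
    static final IntToDoubleFunction LOG_NORM =
        numTerms -> numTerms > 0 ? 1.0 / (1.0 + Math.log(numTerms)) : 0.0;

    // Index time: store the raw field length instead of a precomputed norm.
    static long encodeNorm(int numTerms) {
        return numTerms;
    }

    // Search time: read the raw length back and apply whichever
    // length-norm function is currently under test.
    static double score(double tf, double idf, long storedNorm,
                        IntToDoubleFunction lengthNorm) {
        int numTerms = (int) storedNorm;
        return tf * idf * lengthNorm.applyAsDouble(numTerms);
    }

    public static void main(String[] args) {
        long stored = encodeNorm(100); // a field with 100 terms
        System.out.println("default: " + score(2.0, 1.5, stored, DEFAULT_NORM));
        System.out.println("log:     " + score(2.0, 1.5, stored, LOG_NORM));
    }
}
```

Because the index holds only the raw length, switching between the two
functions is a search-time change and needs no re-indexing.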

Thanks,
Hamid

On Fri, Oct 24, 2014 at 2:21 AM, Alan Woodward <a...@flax.co.uk> wrote:

> Hi Hamid,
>
> Can't you just extend Similarity instead?
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 24 Oct 2014, at 08:04, Hafiz Hamid wrote:
>
> Hi - I wanted to check if folks would be okay with removing the "final"
> modifier from 4 methods (i.e. computeNorm, computeWeight, exactSimScorer
> and sloppySimScorer) in Lucene's TFIDFSimilarity class. It doesn't look
> like allowing these methods to be overridden would have any negative
> implications for the function of this class, yet it would enable us to
> tune the tf-idf scoring provided by this class to better serve our needs.
>
> I've logged a Jira issue for this: LUCENE-6023
> <https://issues.apache.org/jira/browse/LUCENE-6023>. If folks don't have
> any objection, I've a patch ready and can upload it.
>
> Thanks,
> Hamid
>
>
>
