Compared with the classical VSM, Lucene simply drops the denominator
(|Q|*|D|) of the similarity formula, but it adds norm(t,d) and coord(q,d)
to take into account the fraction of terms in the query and the document,
so in practice it is a modified implementation of the VSM.
Do you just want to use Lucene to verify which implementation of the VSM
in "ieee-sw-rank" is more precise in practice?
If so, it's a useful experiment.
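
To make the comparison concrete, here is a minimal sketch of the classical
cosine form in plain Java. It does not use Lucene's API at all; the class
and method names (ClassicVsm, idf, weights, cosine) are made up for
illustration, and idf = log10(N / df) is my assumption about the form used
in the model you linked.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Lucene.
final class ClassicVsm {

    // Raw term counts for one tokenized text.
    public static Map<String, Integer> termFreq(List<String> tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    // Assumed form: idf(t) = log10(N / df(t)), computed over all indexed docs.
    public static Map<String, Double> idf(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String t : termFreq(doc).keySet()) {
                df.merge(t, 1, Integer::sum);
            }
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            result.put(e.getKey(), Math.log10((double) docs.size() / e.getValue()));
        }
        return result;
    }

    // Weight vector w(t) = tf(t) * idf(t); terms unknown to the index are skipped.
    public static Map<String, Double> weights(List<String> tokens, Map<String, Double> idf) {
        Map<String, Double> w = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFreq(tokens).entrySet()) {
            Double termIdf = idf.get(e.getKey());
            if (termIdf != null) {
                w.put(e.getKey(), e.getValue() * termIdf);
            }
        }
        return w;
    }

    // Dot product over the terms the two vectors share.
    public static double dot(Map<String, Double> a, Map<String, Double> b) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                sum += e.getValue() * other;
            }
        }
        return sum;
    }

    // Vector length, taken after multiplying tf by idf.
    public static double length(Map<String, Double> v) {
        return Math.sqrt(dot(v, v));
    }

    // Classical cosine: dot(Q, D) / (|Q| * |D|); 0 if either vector is empty.
    public static double cosine(Map<String, Double> q, Map<String, Double> d) {
        double denominator = length(q) * length(d);
        return denominator == 0.0 ? 0.0 : dot(q, d) / denominator;
    }
}
```

Returning dot(q, d) on its own would correspond to dropping the |Q|*|D|
denominator as described above. Lucene's real score has further factors
that this sketch does not model.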

2008/2/27, Dharmalingam <[EMAIL PROTECTED]>:
>
>
> Hi List,
>
> I am pretty new to Lucene. Certainly, it is very exciting. I need to
> implement a new Similarity class based on the Term Vector Space Model
> given
> in http://www.miislita.com/term-vector/term-vector-3.html
>
> Although that model is similar to Lucene's model
> (
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/Similarity.html
> ),
> I am having a hard time extending the Similarity class to calculate that
> model.
>
> In that model, "tf" is multiplied with Idf for all terms in the index, but
> in Lucene "tf" is calculated only for terms in the given Query. Because of
> that effect, the norm calculation should also include "idf" for all terms.
> Lucene calculates the norm, during indexing, by "just" counting the number
> of terms per document. In the web formula (in miislita.com), a document
> norm
> is calculated after multiplying "tf" and "idf".
>
> FYI: I could implement "idf" according to the miislita.com formula, but not
> the "tf" and "norm".
>
> Could you please comment on how I can implement a new Similarity class
> that fits into Lucene's architecture but still implements the vector
> space model given in miislita.com?
>
> Thanks a lot for your comments,
>
> Dharma
>
>
> --
> View this message in context:
> http://www.nabble.com/Vector-Space-Model%3A-New-Similarity-Implementation-Issues-tp15696719p15696719.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>
>
