Hi all,

I've just published a tiny extension to Lucene 4.0 that implements a mixture
of language models on top of the standard FunctionQuery and ValueSource classes:
https://github.com/nzhiltsov/lucene-mlm

I'd like you to assess the possibility of integrating this code into
Lucene. I'd appreciate any comments or fixes.

NB: The implementation avoids using LMSimilarity on a per-field basis,
because that would break the computation of correct Dirichlet priors for
non-matched terms. The standard LMSimilarity class does not include such
terms when calculating term frequencies and so treats them as
zero-probability entries.
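
To illustrate the point about non-matched terms, here is a minimal
standalone sketch. It is not the code from the repository and it does not
use any Lucene API; the class name, method names, and parameters are
hypothetical and chosen only for illustration. It assumes the usual
Dirichlet-smoothed estimate (tf + mu * P(t|C)) / (len + mu) per field and a
weighted sum over fields. With tf = 0 the estimate is still positive, so a
non-matched term contributes its prior instead of being dropped.

```java
/** Standalone sketch; does not depend on Lucene. All names are hypothetical. */
class MixtureLmSketch {

    /**
     * Dirichlet-smoothed probability of a term in one field of a document:
     * (tf + mu * P(t|C)) / (fieldLength + mu).
     */
    static double dirichletProb(double tf, double fieldLength,
                                double collectionProb, double mu) {
        return (tf + mu * collectionProb) / (fieldLength + mu);
    }

    /** Weighted mixture of the per-field smoothed probabilities for one term. */
    static double mixtureProb(double[] weights, double[] tfs, double[] fieldLengths,
                              double[] collectionProbs, double mu) {
        double p = 0.0;
        for (int f = 0; f < weights.length; f++) {
            p += weights[f]
                    * dirichletProb(tfs[f], fieldLengths[f], collectionProbs[f], mu);
        }
        return p;
    }

    /**
     * Query log-likelihood: sums the log mixture probability over every
     * query term, including terms whose tf is 0 in all fields.
     * tfsPerTerm[t][f] and collProbsPerTerm[t][f] are indexed by term, then field.
     */
    static double score(double[] weights, double[][] tfsPerTerm, double[] fieldLengths,
                        double[][] collProbsPerTerm, double mu) {
        double logLikelihood = 0.0;
        for (int t = 0; t < tfsPerTerm.length; t++) {
            logLikelihood += Math.log(mixtureProb(
                    weights, tfsPerTerm[t], fieldLengths, collProbsPerTerm[t], mu));
        }
        return logLikelihood;
    }
}
```

In this sketch a document that misses a query term still gets a finite
score, and the size of the penalty depends on the field lengths and the
collection probabilities, which is the behaviour the note above refers to.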

-- 

Nikita Zhiltsov

Visiting Graduate Student
Emory University
Intelligent Information Access Lab
E500 Emerson Hall, Atlanta, Georgia, USA
Phone: (404) 834-5364
E-mail: znik...@emory.edu


---------------------------------------------------------------------
Graduate Student, Research Fellow
Kazan Federal University
Computational Linguistics Laboratory
Russia, 420008
Kazan, Prof. Nuzhina Str., 1/37 room 117
Skype: nickita.jhiltsov
Personal page: http://cll.niimm.ksu.ru/~nzhiltsov
E-mail: nikita.zhilt...@gmail.com
