[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2959:
--------------------------------
Attachment: LUCENE-2959_nocommits.patch
Patch removing all nocommits.
For the fake IDF/phrase issue, I thought it best not to "fake" statistics for
SimilarityBase, since the whole point is to make implementing and testing
ranking models simpler.
Instead, it sums scores across the terms (kind of like a boolean query).
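A minimal sketch of the "sum scores across terms" idea, not the code in the patch. It assumes each term of the phrase is scored with its own real statistics and the shared phrase frequency, and the results are added up. All names here (PhraseScoreSketch, TermStats, scoreTerm, scorePhrase) are hypothetical, and the tf*idf formula only stands in for whatever ranking model is plugged in.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: every name in this class is hypothetical, not Lucene API. */
class PhraseScoreSketch {

    /** Per-term collection statistics (hypothetical holder). */
    static final class TermStats {
        final long docFreq;  // number of documents containing the term
        final long numDocs;  // number of documents in the collection

        TermStats(long docFreq, long numDocs) {
            this.docFreq = docFreq;
            this.numDocs = numDocs;
        }
    }

    /** Toy single-term model (tf * idf), standing in for any ranking model. */
    static double scoreTerm(TermStats stats, double freq) {
        double idf = Math.log(1.0 + (double) stats.numDocs / stats.docFreq);
        return freq * idf;
    }

    /**
     * Phrase score: the sum of the per-term scores. Each term keeps its own
     * statistics, so no combined "phrase IDF" has to be invented.
     */
    static double scorePhrase(List<TermStats> terms, double phraseFreq) {
        double sum = 0.0;
        for (TermStats t : terms) {
            sum += scoreTerm(t, phraseFreq);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<TermStats> phrase = Arrays.asList(
                new TermStats(10, 1000),    // rare term
                new TermStats(200, 1000));  // common term
        System.out.println(scorePhrase(phrase, 2.0));
    }
}
```

With this shape the model implementation only ever sees ordinary single-term statistics, which is the simplification argued for above.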
For DFR P and D, I don't think there are really any great practical ways out of
the fundamental problem. I added notes to both of these.
I think the workaround for Dirichlet is fine. I looked around and found another
implementation of this smoothing by Hiemstra, and it had the same workaround
(http://mirex.sourceforge.net /
trec.nist.gov/pubs/trec19/papers/univ.twente.web.rev.pdf).
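The message does not say what the workaround is. Purely as an illustration, and assuming the problem is that the rank-equivalent Dirichlet-smoothed score log(p_mu(w|d) / p(w|C)) can go negative, one possible workaround is to clamp negative scores to zero. The class and parameter names below are hypothetical, and the clamping is an assumption rather than something the patch or the cited implementation is known from this message to do.

```java
/** Sketch only: hypothetical names, not Lucene API. */
class DirichletSketch {

    /**
     * Dirichlet-smoothed language model score for one term, written as
     * log(p_mu(w|d) / p(w|C)) with p_mu(w|d) = (freq + mu * p(w|C)) / (docLen + mu).
     *
     * @param freq           term frequency in the document
     * @param docLen         document length in tokens
     * @param collectionProb p(w|C), the term's probability in the collection
     * @param mu             smoothing parameter
     */
    static double score(double freq, double docLen, double collectionProb, double mu) {
        double raw = Math.log(1.0 + freq / (mu * collectionProb))
                   + Math.log(mu / (docLen + mu));
        // ASSUMPTION: clamp at zero so the score is never negative.
        return Math.max(0.0, raw);
    }

    public static void main(String[] args) {
        // Term absent from the document: raw score is negative, clamped to 0.
        System.out.println(score(0.0, 100.0, 0.01, 2000.0));
        // Term occurring 5 times: raw score is positive and returned unchanged.
        System.out.println(score(5.0, 100.0, 0.001, 2000.0));
    }
}
```

Clamping loses the length penalty for documents whose raw score is below zero, which is the trade-off such a workaround accepts.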
All the other similarities seem to work fine when randomly swapped into
Lucene's tests.
> [GSoC] Implementing State of the Art Ranking for Lucene
> -------------------------------------------------------
>
> Key: LUCENE-2959
> URL: https://issues.apache.org/jira/browse/LUCENE-2959
> Project: Lucene - Java
> Issue Type: New Feature
> Components: core/query/scoring, general/javadocs, modules/examples
> Reporter: David Mark Nemeskey
> Assignee: Robert Muir
> Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: flexscoring branch
>
> Attachments: LUCENE-2959_mockdfr.patch, LUCENE-2959_nocommits.patch,
> implementation_plan.pdf, proposal.pdf
>
>
> Lucene employs the Vector Space Model (VSM) to rank documents, which compares
> unfavorably to state of the art algorithms, such as BM25. Moreover, the
> architecture is tailored specifically to VSM, which makes the addition of new
> ranking functions a non-trivial task.
> This project aims to bring state of the art ranking methods to Lucene and to
> implement a query architecture with pluggable ranking functions.
> The wiki page for the project can be found at
> http://wiki.apache.org/lucene-java/SummerOfCode2011ProjectRanking.