[ https://issues.apache.org/jira/browse/SOLR-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547440#comment-14547440 ]
Naveen Kumar commented on SOLR-5038:
------------------------------------

Is anyone working on MMR in search ranking?

> Diversity Search Result In Rank
> -------------------------------
>
>                 Key: SOLR-5038
>                 URL: https://issues.apache.org/jira/browse/SOLR-5038
>             Project: Solr
>          Issue Type: New Feature
>          Components: SearchComponents - other
>         Environment: irrelevant
>            Reporter: Alon Lanyado
>              Labels: features
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> We would like to add a Diversity SearchComponent/RequestHandler for Solr.
> We will implement MMR (Maximal Marginal Relevance), which is one of the
> simplest algorithms for this problem; in the next version we will improve it.
> The idea is that you have a lot of similar documents in your search results
> (duplicates and near-duplicates that you must index), and the ranking shows
> all those documents one after another. It's a very common problem for
> organizations.
> We need to return a bigger list of documents from the searcher (the size is a
> parameter that needs to be chosen based on system performance) and run the
> MMR calculation on their scores:
> lambda * OldRank + (1 - lambda) * min_similarity{similarity of the current
> document to the subset of documents already chosen to return in search
> results}
> lambda is a parameter between 0 and 1: the strength of the diversity.
> min_similarity is calculated based on the Lucene default similarity (TF-IDF)
> for the subset of already chosen documents.
> The new score will represent a combination of the relevance score and
> diversity from other documents.
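
For anyone picking this up, the re-ranking step described in the issue could look roughly like the sketch below. It is only an illustration in Python, not Solr code and not the reporter's implementation. Several things here are assumptions: the function names `cosine` and `mmr_rerank`, the `(doc_id, relevance, text)` tuple layout, relevance scores normalized to [0, 1], and a plain term-count cosine standing in for Lucene's TF-IDF similarity. Note also that the sketch uses the usual MMR formulation, which subtracts the maximum similarity to the already selected documents, and that differs from the formula as written in the issue.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_rerank(candidates, lam=0.7, k=10):
    """Greedy MMR re-ranking over an over-fetched candidate list.

    candidates: list of (doc_id, relevance, text) tuples (hypothetical
    layout); relevance is assumed to be normalized to [0, 1].
    lam: weight of relevance versus redundancy, between 0 and 1.
    Returns the ids of up to k documents in their new order.
    """
    # Term-count vectors stand in for Lucene's TF-IDF similarity here.
    vectors = {doc_id: Counter(text.lower().split())
               for doc_id, _, text in candidates}
    remaining = {doc_id: rel for doc_id, rel, _ in candidates}
    selected = []

    while remaining and len(selected) < k:
        def mmr_score(doc_id):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(vectors[doc_id], vectors[s]) for s in selected),
                default=0.0,
            )
            return lam * remaining[doc_id] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]

    return selected
```

With `lam=1.0` this reduces to the original relevance order; lower values push near-duplicates of already selected documents further down. Each pick compares every remaining candidate against every selected one, so the size of the over-fetched candidate list is the main cost, which matches the issue's remark that it has to be chosen based on system performance.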