[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090685#comment-13090685 ]
Robert Muir commented on LUCENE-2959:
-------------------------------------

I rearranged the BM25 in the branch a little bit; it's now as fast as Lucene's ranking formula:

{noformat}
Task              QPS tfidf  StdDev tfidf  QPS bm25  StdDev bm25  Pct diff
SpanNear               4.29          0.52      4.14         0.49  -24% - 22%
Phrase                 3.97          0.25      3.89         0.25  -13% - 11%
Term                  82.18          4.78     81.00         2.56  -9% - 7%
TermBGroup1M1P        83.30          2.41     82.12         2.20  -6% - 4%
SloppyPhrase           8.03          0.31      7.93         0.43  -10% - 8%
AndHighHigh           19.38          0.59     19.16         0.71  -7% - 5%
PKLookup             175.49          4.33    173.67         4.20  -5% - 3%
AndHighMed            40.99          1.12     40.71         1.07  -5% - 4%
TermGroup1M           25.69          0.39     25.69         0.44  -3% - 3%
Fuzzy2                42.62          1.83     42.65         1.80  -8% - 8%
Fuzzy1                91.74          3.48     91.86         3.44  -7% - 7%
Respell               73.96          3.30     74.18         3.29  -8% - 9%
Wildcard              56.33          0.97     56.60         1.08  -3% - 4%
Prefix3               33.36          0.83     33.59         0.97  -4% - 6%
TermBGroup1M          55.58          1.03     56.17         0.88  -2% - 4%
IntNRQ                13.38          0.74     13.58         0.94  -10% - 14%
OrHighMed             11.71          1.18     11.94         0.97  -14% - 22%
OrHighHigh             8.91          0.74      9.13         0.63  -11% - 19%
{noformat}

> [GSoC] Implementing State of the Art Ranking for Lucene
> -------------------------------------------------------
>
>                 Key: LUCENE-2959
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2959
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/query/scoring, general/javadocs, modules/examples
>            Reporter: David Mark Nemeskey
>            Assignee: Robert Muir
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: flexscoring branch
>
>         Attachments: LUCENE-2959_mockdfr.patch, implementation_plan.pdf,
>                      proposal.pdf
>
>
> Lucene employs the Vector Space Model (VSM) to rank documents, which compares
> unfavorably to state of the art algorithms, such as BM25. Moreover, the
> architecture is tailored specifically to VSM, which makes the addition of new
> ranking functions a non-trivial task.
> This project aims to bring state of the art ranking methods to Lucene and to
> implement a query architecture with pluggable ranking functions.
> The wiki page for the project can be found at
> http://wiki.apache.org/lucene-java/SummerOfCode2011ProjectRanking.
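
A note on reading the "Pct diff" column in the benchmark table: the ranges appear to compare the pessimistic and optimistic ends of the two measurements, each taken as QPS plus or minus one StdDev. The sketch below is a reconstruction from the posted numbers, not the benchmark tool's own code; the function name pct_diff_range and the truncation toward zero are assumptions, chosen because they reproduce the rows shown.

```python
def pct_diff_range(qps_base, sd_base, qps_cmp, sd_cmp):
    """Return (worst, best) percent change of cmp relative to base.

    Assumption: worst case pairs the low end of cmp with the high end
    of base, best case does the opposite, and the result is truncated
    toward zero to a whole percent.
    """
    worst = (qps_cmp - sd_cmp) / (qps_base + sd_base) - 1.0
    best = (qps_cmp + sd_cmp) / (qps_base - sd_base) - 1.0
    # int() truncates toward zero, e.g. -9.8 becomes -9
    return int(100 * worst), int(100 * best)


if __name__ == "__main__":
    # (task, QPS tfidf, StdDev tfidf, QPS bm25, StdDev bm25) from the table
    rows = [
        ("SpanNear", 4.29, 0.52, 4.14, 0.49),
        ("Term", 82.18, 4.78, 81.00, 2.56),
        ("OrHighHigh", 8.91, 0.74, 9.13, 0.63),
    ]
    for task, qb, sb, qc, sc in rows:
        lo, hi = pct_diff_range(qb, sb, qc, sc)
        print(f"{task:<12} {lo}% - {hi}%")
```

Under these assumptions SpanNear comes out as -24% - 22%, Term as -9% - 7%, and OrHighHigh as -11% - 19%, matching the table. Every range in the table straddles zero, which is consistent with the statement that BM25 is now as fast as the tf-idf formula within measurement noise.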