[ https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542058#comment-13542058 ]
Stefan Pohl commented on LUCENE-4100:
-------------------------------------

Otis, thank you for your interest, and I wish everyone a happy New Year!

Regarding speedups, have a look at http://vimeo.com/44300228 from 12 minutes onwards. It obviously depends on your mix of queries and collection size, but on average a speedup of more than 100% should be achievable for large collections and typical query sets, with much higher speedups for problem queries (many frequent terms).

The contribution as-is (after adaptation to the latest Lucene API/code base) could already be included as a separate Lucene module for people who can live with its current limitations (static indexes only, smaller totalHitCount).

In my spare time (unfortunately not much of that recently), I continue working on different approaches to make this more general, but this is pure experimentation, and any production-ready offspring of that will take longer than 4.x and might require API changes (such as the ones suggested above by Robert), so perhaps 5.0 is a good aim. I suggest we continue rolling this ticket forward, or attach only version 5.0 to it.

> Maxscore - Efficient Scoring
> ----------------------------
>
> Key: LUCENE-4100
> URL: https://issues.apache.org/jira/browse/LUCENE-4100
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs, core/query/scoring, core/search
> Affects Versions: 4.0-ALPHA
> Reporter: Stefan Pohl
> Labels: api-change, patch, performance
> Fix For: 4.2, 5.0
>
> Attachments: contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood,
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with
> example queries and lucenebench, the package of Mike McCandless, resulting in
> very significant speedups.
> This ticket is to get the discussion started on including the implementation
> into Lucene's codebase. Because the technique requires awareness about it
> from the Lucene user/developer, it seems best to become a contrib/module
> package so that it consciously can be chosen to be used.