[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

Stefan Pohl (JIRA) Wed, 02 Jan 2013 00:40:16 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542058#comment-13542058
 ]


Stefan Pohl commented on LUCENE-4100:
-------------------------------------

Otis, thank you for your interest and I wish everyone a happy New Year!

Regarding speedups, have a look at http://vimeo.com/44300228 from 12 minutes 
onwards. The gain obviously depends on your mix of queries and collection size, 
but an average speedup of more than 100% should be achievable for large 
collections and typical query sets, with much higher speedups for problem 
queries (those with many frequent terms).

The contribution as-is (after adaptation to the latest Lucene API/code base) 
could already be included as a separate Lucene module for people who can live 
with its current limitations (static indexes only, smaller totalHitCount).
In my spare time (unfortunately not much of that recently), I continue working 
on different approaches to make this more general, but this is pure 
experimentation, and any production-ready offspring of it will take longer 
than 4.x and might require API changes (such as the ones suggested above by 
Robert), so perhaps 5.0 is a good target.

I suggest we keep rolling this ticket forward, or attach only version 5.0 
to it.
                
> Maxscore - Efficient Scoring
> ----------------------------
>
>                 Key: LUCENE-4100
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4100
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs, core/query/scoring, core/search
>    Affects Versions: 4.0-ALPHA
>            Reporter: Stefan Pohl
>              Labels: api-change, patch, performance
>             Fix For: 4.2, 5.0
>
>         Attachments: contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient 
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood, 
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and ran some performance measurements with 
> example queries and lucenebench, Mike McCandless's package, resulting in 
> very significant speedups.
> This ticket is meant to start the discussion on including the implementation 
> in Lucene's codebase. Because the technique requires awareness on the part of 
> the Lucene user/developer, it seems best for it to become a contrib/module 
> package so that it can be chosen consciously.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
