[ https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542058#comment-13542058 ]
Stefan Pohl commented on LUCENE-4100:
-------------------------------------
Otis, thank you for your interest and I wish everyone a happy New Year!
Regarding speedups, have a look at http://vimeo.com/44300228 from the
12-minute mark onwards. The gain obviously depends on your mix of queries and
on collection size, but average speedups of more than 100% should be
achievable for large collections and typical query sets, with much higher
speedups for problem queries (those with many frequent terms).
The contribution as-is (once adapted to the latest Lucene API/code-base) could
already be included as a separate Lucene module for people who can live with
its current limitations (static indexes only, smaller totalHitCount).
In my spare time (unfortunately not much of that recently), I continue working
on different approaches to make this more general, but this is pure
experimentation, and anything production-ready that comes out of it will not
be finished within the 4.x timeframe and might require API changes (such as
the ones suggested above by Robert), so perhaps 5.0 is a good aim.
I suggest we either keep rolling this ticket forward or attach only version
5.0 to it.
> Maxscore - Efficient Scoring
> ----------------------------
>
> Key: LUCENE-4100
> URL: https://issues.apache.org/jira/browse/LUCENE-4100
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs, core/query/scoring, core/search
> Affects Versions: 4.0-ALPHA
> Reporter: Stefan Pohl
> Labels: api-change, patch, performance
> Fix For: 4.2, 5.0
>
> Attachments: contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood,
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with
> example queries and lucenebench, the package of Mike McCandless, resulting in
> very significant speedups.
> This ticket is meant to start the discussion on including the implementation
> in Lucene's codebase. Because the technique requires awareness of it on the
> part of the Lucene user/developer, it seems best for it to become a
> contrib/module package so that it can be chosen consciously.