[ 
https://issues.apache.org/jira/browse/SOLR-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316593#comment-16316593
 ] 

Anshum Gupta commented on SOLR-11831:
-------------------------------------

[~mjosephidou] I think you forgot to attach the patch :)

>  Skip second grouping step if group.limit is 1 (aka Las Vegas patch)
> --------------------------------------------------------------------
>
>                 Key: SOLR-11831
>                 URL: https://issues.apache.org/jira/browse/SOLR-11831
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Malvina Josephidou
>            Priority: Minor
>
> In cases where we do grouping and ask for  {{group.limit=1}} only it is 
> possible to skip the second grouping step. In our test datasets it improved 
> speed by around 40%.
> Essentially, in the first grouping step each shard returns the top K groups 
> based on the highest scoring document in each group. The top K groups from 
> each shard are merged in the federator and in the second step we ask all the 
> shards to return the top documents from each of the top ranking groups.
> If we only want to return the highest scoring document per group we can 
> return the top document id in the first step, merge results in the federator 
> to retain the top K groups and then skip the second grouping step entirely. 
> This is possible provided that:
> a) We do not need to know the total number of matching documents per group
> b) Within group sort and between group sort is the same. 
> c) We are not doing reranking (this is because this is done in the second 
> grouping step. It is also possible to get this to work with reranking but 
> more work and some additional assumptions are required)
>  
> This patch applies the grouping optimisation in cases where a)-c) apply and 
> we are only sorting by relevance. It is also possible to extend this work to 
> handle multiple sorting criteria and also reranking. 
> P.S. Diego and I called this patch "las vegas" because we started to write it 
> on the flight to Las Vegas for Lucene/Solr revolution. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to