[ https://issues.apache.org/jira/browse/SOLR-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cassandra Targett updated SOLR-11831:
-------------------------------------
    Component/s: Grouping

> Skip second grouping step if group.limit is 1 (aka Las Vegas patch)
> --------------------------------------------------------------------
>
>                 Key: SOLR-11831
>                 URL: https://issues.apache.org/jira/browse/SOLR-11831
>             Project: Solr
>          Issue Type: Improvement
>          Components: Grouping
>            Reporter: Malvina Josephidou
>            Priority: Minor
>          Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> In cases where we do grouping and ask only for {{group.limit=1}}, it is
> possible to skip the second grouping step. In our test datasets it improved
> speed by around 40%.
> Essentially, in the first grouping step each shard returns the top K groups
> based on the highest scoring document in each group. The top K groups from
> each shard are merged in the federator, and in the second step we ask all the
> shards to return the top documents from each of the top ranking groups.
> If we only want to return the highest scoring document per group, we can
> return the top document id in the first step, merge results in the federator
> to retain the top K groups, and then skip the second grouping step entirely.
> This is possible provided that:
> a) We do not need to know the total number of matching documents per group.
> b) The within-group sort and the between-group sort are the same.
> c) We are not doing reranking (reranking is done in the second grouping
> step. It is possible to get this to work with reranking, but more work and
> some additional assumptions are required).
>
> This patch applies the grouping optimisation in cases where a)-c) apply and
> we are only sorting by relevance. It is also possible to extend this work to
> handle multiple sorting criteria and also reranking.
> P.S. Diego and I called this patch "las vegas" because we started to write it
> on the flight to Las Vegas for Lucene/Solr Revolution.
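
The single-pass idea in the description can be sketched roughly as follows. This is only an illustration in Python, not the code in the patch or any Solr API: the names GroupHit, shard_top_groups and merge_shards are hypothetical, documents are assumed to be plain (group, doc_id, score) tuples, and sorting is assumed to be by score only, as in conditions a)-c).

```python
import heapq
from typing import NamedTuple


class GroupHit(NamedTuple):
    """Hypothetical record: one group plus its best document on a shard."""
    group: str
    doc_id: str
    score: float


def best_per_group(hits):
    """Keep only the highest-scoring hit for each group value."""
    best = {}
    for hit in hits:
        if hit.group not in best or hit.score > best[hit.group].score:
            best[hit.group] = hit
    return best


def shard_top_groups(docs, k):
    """First grouping step on one shard: top-k groups, each with its top doc id."""
    hits = (GroupHit(group, doc_id, score) for group, doc_id, score in docs)
    return heapq.nlargest(k, best_per_group(hits).values(), key=lambda h: h.score)


def merge_shards(shard_results, k):
    """Federator: merge per-shard groups and retain the overall top-k groups.

    Each hit already carries a document id, so no second request to the
    shards is needed when only one document per group is wanted.
    """
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(k, best_per_group(all_hits).values(), key=lambda h: h.score)


# Made-up data for two shards: (group value, doc id, score).
shard_a = [("red", "a1", 0.9), ("red", "a2", 0.4), ("blue", "a3", 0.7)]
shard_b = [("red", "b1", 0.95), ("green", "b2", 0.8), ("blue", "b3", 0.2)]

top = merge_shards([shard_top_groups(shard_a, 2), shard_top_groups(shard_b, 2)], 2)
for hit in top:
    print(hit.group, hit.doc_id, hit.score)
```

Because the within-group sort and the between-group sort are the same here, the document that ranks a group in the first step is also the one document that would be returned for it, which is why the merged result is already complete.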