GitHub user mjosephidou opened a pull request:
https://github.com/apache/lucene-solr/pull/300
SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas
Patch)
Summary:
When a grouped query asks only for {{group.limit=1}}, the second grouping
step can be skipped. On our test datasets this improved speed by around 40%.
Essentially, in the first grouping step each shard returns its top K groups,
ranked by the highest-scoring document in each group. The federator merges the
top K groups from each shard, and in the second step we ask all the shards
to return the top documents from each of the top-ranking groups.
If we only want the highest-scoring document per group, each shard can
return the top document id in the first step. The federator then merges the
results to retain the top K groups, and the second grouping step is skipped
entirely.
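As an illustration of the single-step merge described above, here is a minimal
sketch in Python. It is not the code in this patch: the GroupHead type, the
merge_top_groups function and the sample data are hypothetical, and it assumes
groups are ranked purely by score, descending.

```python
from typing import Dict, List, NamedTuple


class GroupHead(NamedTuple):
    """Top document of one group as reported by one shard (hypothetical shape)."""
    group: str
    doc_id: str
    score: float


def merge_top_groups(shard_results: List[List[GroupHead]], k: int) -> List[GroupHead]:
    """Merge per-shard group heads and keep the k best groups overall."""
    best: Dict[str, GroupHead] = {}
    for heads in shard_results:
        for head in heads:
            current = best.get(head.group)
            # A group may appear on several shards; keep its highest-scoring document.
            if current is None or head.score > current.score:
                best[head.group] = head
    # Rank groups by the score of their top document, highest first.
    ranked = sorted(best.values(), key=lambda h: h.score, reverse=True)
    return ranked[:k]


# Each shard reports its top groups together with the top document id and score.
shard_a = [GroupHead("red", "a1", 0.9), GroupHead("blue", "a7", 0.4)]
shard_b = [GroupHead("blue", "b3", 0.8), GroupHead("green", "b5", 0.6)]
top = merge_top_groups([shard_a, shard_b], k=2)
```

Because every shard already sends the id of each group's top document, the
merged list holds everything needed for a group.limit=1 response, so no second
request to the shards is required in this sketch.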
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bloomberg/lucene-solr SOLR-11831
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/300.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #300
----
commit 6b918c86cd0f37320c32eb669eca722a9e74f768
Author: Malvina Josephidou <mjosephidou@...>
Date: 2018-01-04T15:00:35Z
SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas
patch)
----