[ https://issues.apache.org/jira/browse/SOLR-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204362#comment-13204362 ]
Martijn van Groningen edited comment on SOLR-3109 at 2/9/12 9:11 AM:
---------------------------------------------------------------------
Thanks for the refactoring!
{quote}
The bug is in the data structure used to map search groups to the shards which
contain them. ResponseBuilder.searchGroupToShard assumes that a given search
group only resides on one shard. I could not find this assumption documented
anywhere, nor can I find a reason such a restriction need be imposed.
{quote}
There is no such restriction. A search group can reside on more than one shard.
I wonder why this issue didn't cause test failures or visible bugs from the
beginning. My guess is that, because of the redundant requests, every shard was
queried anyway, so the end result was still correct. The latest patch I attached
should at least have caused a test failure, but it didn't. Can you share how you
tested this? It could then be added to the TestDistributedGrouping test class.
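
To illustrate the point that a search group can live on several shards, here is a minimal sketch of a group-to-shards mapping that keeps a set of shards per group instead of a single shard. All names in it (GroupShardIndex, record, shardsFor, shardsToQuery) are hypothetical and chosen for illustration only; this is not the actual ResponseBuilder code or the contents of any attached patch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a group-to-shard lookup; not Solr's real data structure.
final class GroupShardIndex {
    // Each search group value maps to every shard that reported it in the first phase.
    private final Map<String, Set<String>> groupToShards = new HashMap<>();

    // Record that the given shard returned the given group in the first phase.
    void record(String groupValue, String shard) {
        groupToShards.computeIfAbsent(groupValue, k -> new LinkedHashSet<>()).add(shard);
    }

    // All shards known to contain the group (empty if the group was never seen).
    Set<String> shardsFor(String groupValue) {
        return groupToShards.getOrDefault(groupValue, Collections.<String>emptySet());
    }

    // Distinct shards that need a second-phase request for the given top groups.
    Set<String> shardsToQuery(Collection<String> topGroups) {
        Set<String> shards = new LinkedHashSet<>();
        for (String group : topGroups) {
            shards.addAll(shardsFor(group));
        }
        return shards;
    }

    public static void main(String[] args) {
        GroupShardIndex index = new GroupShardIndex();
        index.record("groupA", "shard1");
        index.record("groupA", "shard2"); // same group, second shard
        index.record("groupB", "shard2");
        System.out.println(index.shardsFor("groupA"));
        System.out.println(index.shardsToQuery(Arrays.asList("groupA", "groupB")));
    }
}
```

With a single-valued map, the second record call for groupA would overwrite the first, which is the kind of silent loss that redundant requests to every shard could hide.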
> group=true requests result in numerous redundant shard requests
> ---------------------------------------------------------------
>
> Key: SOLR-3109
> URL: https://issues.apache.org/jira/browse/SOLR-3109
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 3.5, 4.0
> Environment: 64-bit Linux, sharded environment
> Reporter: Russell Black
> Assignee: Martijn van Groningen
> Priority: Critical
> Labels: patch, performance
> Attachments: SOLR-3109.patch, SOLR-3109.patch, SOLR-3109.patch
>
>
> During the second phase of a group query, the collator sends a query to each
> of the shards. The purpose of this query is for shards to respond with the
> doc ids that match the set of group ids returned from the first phase. The
> problem is that it sends this second query to each shard multiple times.
> Specifically, in an environment with n shards, each shard will be hit with an
> identical query n times during the second phase of query processing,
> resulting in O(_n_ ^2^) performance where _n_ is the number of shards.
> I have traced this bug down to a single line in
> {{TopGroupsShardRequestFactory.java}}, and I am attaching a patch.
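
The request count described above can be checked with a small counting model. This is only a sketch under stated assumptions: the class and method names (SecondPhaseRequestCount, perFirstPhaseResponse, oncePerShard) are hypothetical, and the nested loop is an assumed shape for the redundancy, not a reproduction of the actual line in TopGroupsShardRequestFactory.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical counting model of second-phase shard requests; not Solr code.
final class SecondPhaseRequestCount {
    // Assumed redundant pattern: every first-phase response triggers a request
    // to every shard, so n shards receive n * n requests in total.
    static List<String> perFirstPhaseResponse(List<String> shards) {
        List<String> requests = new ArrayList<>();
        for (String respondingShard : shards) {   // one pass per first-phase response
            for (String targetShard : shards) {   // identical request to each shard
                requests.add(targetShard);
            }
        }
        return requests;
    }

    // Intended pattern: each shard is queried exactly once, so n requests in total.
    static List<String> oncePerShard(List<String> shards) {
        return new ArrayList<>(new LinkedHashSet<>(shards));
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard1", "shard2", "shard3", "shard4");
        System.out.println(perFirstPhaseResponse(shards).size()); // prints 16
        System.out.println(oncePerShard(shards).size());          // prints 4
    }
}
```

For four shards the model gives 16 requests in the redundant case against 4 in the intended case, matching the n-squared versus n growth described in the issue.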