[jira] [Commented] (SOLR-3109) group=true requests result in numerous redundant shard requests

Martijn van Groningen (Commented) (JIRA) Sun, 12 Feb 2012 16:15:29 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206571#comment-13206571
 ]


Martijn van Groningen commented on SOLR-3109:
---------------------------------------------

{quote}
The current TestDistributedGrouping test case is constructed in such a way that 
each record has a unique value for its search group field (i1), so that there 
is never more than one record in any given search group. This style of indexing 
conforms to the restriction discussed earlier. This is likely the reason there 
were no test failures.
{quote}
I think this issue doesn't exist in the released versions of Solr or in 
4.0-dev. Because of the bug, all shards were queried for each ShardRequest 
instance, so all the matching top search groups still arrived at the right 
shard. Only after applying the changes to TopGroupsShardRequestFactory could I 
make the distributed grouping test fail.
                
> group=true requests result in numerous redundant shard requests
> ---------------------------------------------------------------
>
>                 Key: SOLR-3109
>                 URL: https://issues.apache.org/jira/browse/SOLR-3109
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.5, 4.0
>         Environment: 64-bit Linux, sharded environment
>            Reporter: Russell Black
>            Assignee: Martijn van Groningen
>            Priority: Critical
>              Labels: patch, performance
>             Fix For: 3.6, 4.0
>
>         Attachments: 
> SOLR-3109-Backport-of-grouping-performace-fix-to-3.x.patch, 
> SOLR-3109-lucene_solr_3_5.patch, SOLR-3109.patch, SOLR-3109.patch, 
> SOLR-3109.patch
>
>
> During the second phase of a group query, the collator sends a query to each 
> of the shards.  The purpose of this query is for shards to respond with the 
> doc ids that match the set of group ids returned from the first phase.  The 
> problem is that it sends this second query to each shard multiple times.  
> Specifically, in an environment with n shards, each shard will be hit with an 
> identical query n times during the second phase of query processing, 
> resulting in O(_n_ ^2^) performance where _n_ is the number of shards.
> I have traced this bug down to a single line in 
> {{TopGroupsShardRequestFactory.java}}, and I am attaching a patch. 
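
The request count described above can be illustrated with a small, self-contained sketch. This is not Solr code: the class ShardRequestCount and its methods are hypothetical names, and the "fixed" plan assumes that each second-phase request should target only the shard it was created for, which the issue text implies but does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of second-phase request fan-out; not the Solr implementation.
public class ShardRequestCount {

    // Buggy behaviour as described in the issue: one request is created per
    // shard, but every request is addressed to all shards.
    static List<List<String>> buggyPlan(List<String> shards) {
        List<List<String>> requests = new ArrayList<>();
        for (String shard : shards) {
            requests.add(new ArrayList<>(shards));
        }
        return requests;
    }

    // Assumed intended behaviour: each request is addressed only to its own shard.
    static List<List<String>> fixedPlan(List<String> shards) {
        List<List<String>> requests = new ArrayList<>();
        for (String shard : shards) {
            requests.add(Collections.singletonList(shard));
        }
        return requests;
    }

    // Total number of times any shard is hit across all requests.
    static int totalHits(List<List<String>> requests) {
        int hits = 0;
        for (List<String> targets : requests) {
            hits += targets.size();
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard1", "shard2", "shard3", "shard4");
        System.out.println("buggy: " + totalHits(buggyPlan(shards))); // prints "buggy: 16"
        System.out.println("fixed: " + totalHits(fixedPlan(shards))); // prints "fixed: 4"
    }
}
```

With 4 shards the buggy plan produces 16 shard hits (n^2) and the assumed fixed plan produces 4 (n), which matches the growth described in the issue.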

