[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803686#comment-13803686
 ] 

Noble Paul edited comment on SOLR-5381 at 10/24/13 11:43 AM:
-------------------------------------------------------------

bq. It would like to know the state of the other shards as it often has to query 
against all the shards in a collection

You are missing the point: it is very unlikely that anyone will query across 
all shards in a VERY LARGE cluster. Doing so would be almost useless and would 
slow the whole cluster to a crawl. In a VERY LARGE cluster, a node needs 
to know about other shards only when it receives a request/update for another 
shard. Even that may be rare if you use SolrJ as your client, which 
routes requests at the client level. We will have to attack this problem 
sooner or later if we actually take SolrCloud to a very large scale. I'm not 
saying this has to be the first step. We will pick the low-hanging fruit 
first, of course.
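
To illustrate what routing at the client level means here, below is a minimal sketch. It is not SolrJ's actual code: the class ClientSideRouter, the method leaderFor, and the use of String.hashCode() as the hash are all assumptions made for illustration. The point is only that a client which already knows the shard leaders can pick the target shard itself, so the receiving node never needs to look up another shard.

```java
import java.util.List;

// Hypothetical sketch of client-side routing; NOT SolrJ's implementation.
class ClientSideRouter {
    private final List<String> shardLeaderUrls;

    ClientSideRouter(List<String> shardLeaderUrls) {
        this.shardLeaderUrls = shardLeaderUrls;
    }

    // Map a document id to exactly one shard leader.
    // String.hashCode() is a stand-in for whatever hash a real router uses.
    String leaderFor(String docId) {
        int index = Math.floorMod(docId.hashCode(), shardLeaderUrls.size());
        return shardLeaderUrls.get(index);
    }
}
```

With such a router, an update for a given id always goes straight to the same shard, and no server-side forwarding between shards is required.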




was (Author: noble.paul):
bq. It would like to know the state of the other shards as it often has to query 
against all the shards in a collection

You are missing the point: it is very unlikely that anyone will query across 
all shards in a VERY LARGE cluster. Doing so would be almost useless and would 
slow the whole cluster to a crawl. A node needs to know about other 
shards only when it receives a request/update for another shard. Even that may 
be rare if you use SolrJ as your client, which routes requests at the 
client level. We will have to attack this problem sooner or later if we 
actually take SolrCloud to a very large scale. I'm not saying this has to be 
the first step. We will pick the low-hanging fruit first, of course.



> Split Clusterstate and scale 
> -----------------------------
>
>                 Key: SOLR-5381
>                 URL: https://issues.apache.org/jira/browse/SOLR-5381
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> clusterstate.json is a single point of contention for all components in 
> SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
> because there are too many updates and too many nodes need to be notified of 
> the changes. As the number of nodes goes up, the size of clusterstate.json 
> keeps growing, and it will soon exceed the limit imposed by ZK.
> The first step is to store the shard information in separate ZK nodes so that 
> each Solr node only has to listen to the node for the shard it belongs to. We 
> may also need to split each collection into its own node, with 
> clusterstate.json holding just the names of the collections.
> This is an umbrella issue.
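
The difference in notification fan-out described above can be sketched roughly as follows. This is a toy in-memory model, not the ZooKeeper API: WatchedStore, SplitStateSketch, and the path /collections/c1/shards/shardN are hypothetical names chosen for illustration, and the callbacks here are persistent, whereas real ZK watches fire once and must be re-registered. It only counts how many watchers are notified by a single state change under each layout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a store with per-path watchers.
// It is NOT the ZooKeeper client API.
class WatchedStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watchers = new HashMap<>();

    // Register a callback for changes to one path.
    void watch(String path, Consumer<String> callback) {
        watchers.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    // Write a value and return how many watchers were notified.
    int set(String path, String value) {
        data.put(path, value);
        List<Consumer<String>> listeners = watchers.getOrDefault(path, new ArrayList<>());
        for (Consumer<String> listener : listeners) {
            listener.accept(value);
        }
        return listeners.size();
    }
}

class SplitStateSketch {
    // Layout 1: every Solr node watches the single shared state node.
    static int notificationsWithSingleNode(int shards, int replicasPerShard) {
        WatchedStore store = new WatchedStore();
        for (int i = 0; i < shards * replicasPerShard; i++) {
            store.watch("/clusterstate.json", v -> {});
        }
        return store.set("/clusterstate.json", "shard1 changed");
    }

    // Layout 2: each Solr node watches only the state node of its own shard.
    static int notificationsWithPerShardNodes(int shards, int replicasPerShard) {
        WatchedStore store = new WatchedStore();
        for (int s = 1; s <= shards; s++) {
            for (int r = 0; r < replicasPerShard; r++) {
                store.watch("/collections/c1/shards/shard" + s, v -> {});
            }
        }
        return store.set("/collections/c1/shards/shard1", "shard1 changed");
    }

    public static void main(String[] args) {
        System.out.println(notificationsWithSingleNode(1000, 3));    // prints 3000
        System.out.println(notificationsWithPerShardNodes(1000, 3)); // prints 3
    }
}
```

In this model, a change to one shard wakes every node in the single-file layout but only that shard's replicas in the split layout, which is the scaling argument the issue makes.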



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
