On 2/26/2015 11:14 PM, Damien Kamerman wrote:
> I've run into an issue with starting my solr cloud with many collections.
> My setup is:
> 3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
> server (256GB RAM).
> 5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
> 1 x Zookeeper 3.4.6
> Java arg -Djute.maxbuffer=67108864 added to solr and ZK.
> 
> Then I stop all nodes, then start all nodes. All replicas are in the down
> state, some have no leader. At times I have seen some (12 or so) leaders in
> the active state. In the solr logs I see lots of:
> 
> org.apache.solr.cloud.ZkController; Still seeing conflicting information
> about the leader of shard shard1 for collection DDDDDD-4351 after 30
> seconds; our state says http://ftea1:8001/solr/DDDDDD-4351_shard1_replica1/,
> but ZooKeeper says http://ftea1:8000/solr/DDDDDD-4351_shard1_replica2/

<snip>

> I've tried staggering the starts (1min) but does not help.
> I've reproduced with zero documents.
> Restarts are OK up to around 3,000 cores.
> Should this work?

This is going to push SolrCloud beyond its limits.  Is this just an
exercise to see how far you can push Solr, or are you looking at setting
up a production install with several thousand collections?

In Solr 4.x, the clusterstate is one giant JSON structure
(clusterstate.json) containing the state of the entire cloud.  With
5000 collections, the whole thing must be downloaded and re-uploaded
at least 5000 times during the course of a successful full system
startup ... and with replicationFactor set to 2, I think that is
actually 10000 times, once per core.  The best-case scenario is that
startup takes a VERY long time; the worst case is that concurrency
problems lead to a deadlock.  A deadlock might be what is happening
here.
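To get a rough sense of the scale involved, here is a back-of-the-envelope
sketch.  The per-collection entry size is a hypothetical figure for
illustration, not a number measured from a real cluster:

```python
# Rough estimate of ZooKeeper traffic during a full restart with the
# monolithic 4.x clusterstate.json.  ENTRY_BYTES is an assumed average
# per-collection size; real entries vary with shard/replica counts.
COLLECTIONS = 5000
REPLICAS_PER_COLLECTION = 2
ENTRY_BYTES = 500  # hypothetical size of one collection's entry

state_bytes = COLLECTIONS * ENTRY_BYTES            # whole clusterstate.json
core_registrations = COLLECTIONS * REPLICAS_PER_COLLECTION

# Each core registration reads the full state and writes it back,
# so the transfer grows with (collections x cores), not just cores.
total_transfer = core_registrations * state_bytes * 2

print(f"clusterstate.json size: {state_bytes / 1e6:.1f} MB")
print(f"core registrations:     {core_registrations}")
print(f"total transfer:         {total_transfer / 1e9:.1f} GB")
```

Even with these made-up sizes, a 2.5 MB state file shuttled back and
forth 10000 times adds up to tens of gigabytes of ZooKeeper traffic,
which is why the restart bogs down as the collection count grows.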

In Solr 5.x, the clusterstate is broken up so there's a separate state
structure for each collection.  This setup allows for faster and safer
multi-threading and far less data transfer.  Assuming I understand the
implications correctly, there might not be any need to increase
jute.maxbuffer at all with 5.x ... though I could be wrong about that.
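Using the same hypothetical sizes as above, the per-collection split
changes the arithmetic dramatically, because each core registration now
reads and writes only its own collection's small state node rather than
the state of the whole cloud:

```python
# Same assumed numbers, but with the 5.x layout: each collection's
# state lives in its own small znode (/collections/<name>/state.json),
# so a core registration touches only that one entry.
COLLECTIONS = 5000
REPLICAS_PER_COLLECTION = 2
ENTRY_BYTES = 500  # hypothetical size of one collection's state.json

core_registrations = COLLECTIONS * REPLICAS_PER_COLLECTION

# Each registration reads and writes one small per-collection node,
# so total transfer grows linearly with the number of cores.
total_transfer = core_registrations * ENTRY_BYTES * 2

print(f"total transfer: {total_transfer / 1e6:.1f} MB")
```

Under these assumptions the restart traffic drops from tens of
gigabytes to roughly 10 MB, and no single znode ever approaches the
default jute.maxbuffer limit.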

I would very much recommend that you set your scenario up from scratch
in Solr 5.0.0, to see if the new clusterstate format eliminates the
problem you're seeing.  If it doesn't, then we can pursue it as a
likely bug in the 5.x branch, and you can file an issue in Jira.

Thanks,
Shawn
