On 9/7/2016 3:19 PM, Eric Young wrote:
> I have a very large ZooKeeper cluster which manages config and replication
> for multiple SolrCloud clusters. I want to split the monolithic ZooKeeper
> cluster into smaller, more manageable clusters in a live migration (i.e.
> minimal or no downtime).

The zookeeper list isn't really the right place for most of this. The
residents of this list will likely have zero knowledge of how Solr uses
zookeeper. I'm on both lists -- and I'm a lot more familiar with Solr
than Zookeeper.

Because Solr normally will not place a large load on zookeeper, I
personally would just use one zookeeper ensemble for both SolrCloud
clusters, each using a different chroot in zookeeper. I'd use either
three or five ZK servers, depending on how likely I thought it would be
that I would need to survive two servers going down.

That's not what you asked about though, so I will attempt to help you
with what you DID ask about.

> I have collections that can be updated dynamically which are already
> separated logically in different SolrCloud clusters. I also have some
> static collections (never updated) that have replicas across all the
> SolrCloud clusters though. All my collections only have a single shard.
>
> ZooKeeper version: 3.4.6
> Solr version: 4.8.1
>
> Example current setup (minimal):
> ZK cluster servers: z1-1, z1-2, z1-3, z2-1, z2-2, z2-3
> Solr cluster 1 servers: s1-1, s1-2
> Solr cluster 2 servers: s2-1, s2-2
>
> Example collections:
> Dynamic collection 1: c1 (sharded on s1-1, s1-2)
> Dynamic collection 2: c2 (sharded on s2-1, s2-2)
> Static collection 1: c3 (sharded on all 4 Solr servers s1-1, s1-2, s2-1,
> s2-2)

If you have a collection that has replicas on all four Solr servers,
then your four Solr servers are *one* SolrCloud cluster, not two. If
they were separate clusters, it would not be possible to have one
collection with shards/replicas on all four servers.

I really don't know what to do for the zookeeper part of this equation.
Somebody else on this list will need to answer that.

Downtime is not going to be avoidable. With careful planning and
execution, you might be able to minimize it.

The first thing you need to do is rearrange the static collection so it
only lives on two of the Solr servers.
To do this, you can use ADDREPLICA if additional replicas are required,
then DELETEREPLICA to remove it from two of the servers.

At this point, you'll need to shut down all instances of Solr, make
whatever changes are required to split the zookeeper cluster (which I
can't help you with), and update zkHost in Solr so that each pair of
servers only talks to the zookeeper servers in its cluster. After making
sure that both zk ensembles have all the information in them, you would
then start your Solr servers back up.

Then you'll want to manually edit the two clusterstates to remove all
mention of the collections and servers that don't belong in each
cluster, and after making sure each clusterstate is correct, restart all
the Solr servers. You *might* be able to just use the DELETE action on
the Collections API to delete collections instead of manually editing
clusterstate, but I'm not 100% positive about that.

Thanks,
Shawn
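
For illustration, here is a rough Python sketch of the Collections API
requests and zkHost strings involved in the steps above. It only builds
the strings and sends nothing. The port numbers (8983, 2181), the shard
name "shard1", the replica name "core_node3", and the "/cluster1" chroot
are assumptions made for the example, not values from the setup
described; check clusterstate for the real shard and replica names
before running anything like this.

```python
from urllib.parse import urlencode


def collections_api_url(solr_base, action, **params):
    """Build (but do not send) a Solr Collections API request URL."""
    query = urlencode({"action": action, **params})
    return f"{solr_base}/admin/collections?{query}"


def zk_host(servers, chroot="", port=2181):
    """Build a zkHost string; the optional chroot is appended once, at the end."""
    return ",".join(f"{server}:{port}" for server in servers) + chroot


# Hypothetical base URL: any live node in the cluster can take the request.
base = "http://s1-1:8983/solr"

# Add a replica of the static collection c3 on a chosen node, if one is needed.
# "shard1" and the node name format are assumed here.
add_url = collections_api_url(
    base, "ADDREPLICA", collection="c3", shard="shard1", node="s1-2:8983_solr"
)

# Remove a replica that lives on a server leaving this cluster.
# "core_node3" is a made-up replica name.
del_url = collections_api_url(
    base, "DELETEREPLICA", collection="c3", shard="shard1", replica="core_node3"
)

# zkHost values after the split: each Solr pair lists only its own ensemble.
zk_cluster1 = zk_host(["z1-1", "z1-2", "z1-3"])
zk_cluster2 = zk_host(["z2-1", "z2-2", "z2-3"])

# The single-ensemble alternative: same servers, one chroot per cluster.
zk_shared_chroot = zk_host(["z1-1", "z1-2", "z1-3"], chroot="/cluster1")

for line in (add_url, del_url, zk_cluster1, zk_cluster2, zk_shared_chroot):
    print(line)
```

The URLs can be pasted into a browser or passed to curl once the names
have been checked against the real cluster.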
