Let's say I was primarily interested in ensuring there is a DR copy of the search index replicated to the remote data center, but I do not want the Solr instances in the remote data center to be part of the SolrCloud cluster, and I am willing to accept some downtime bringing up a Solr cluster in the remote data center if we ever have to use it. Can I use the old HTTP-based replication, with a remote slave pulling from one of the SolrCloud servers, to accomplish that?
Primary Data Center
- 3 x Zookeeper
- 2 x Solr (clustered via SolrCloud)
- 1 x collection
- 1 x shard

Remote Data Center
- 1 x Solr (configured as standalone replication slave against one of the primary data center Solr servers)

Would this work to at least get the data to the remote data center in a reliable way?

Thanks,
Darrell

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Wednesday, February 05, 2014 12:39 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud multiple data center support

On 2/4/2014 10:14 PM, Darrell Burgan wrote:
> Interesting about the Zookeeper quorum problem. What if we were to run three
> Zookeepers in our primary data center and four in the backup data center. If
> we failed over, we wouldn't have a quorum, but we could kill one of the
> Zookeepers to restore a quorum, couldn't we? If we did extend the SolrCloud
> cluster into a second data center, wouldn't queries against the cluster be
> routed to the second data center sometimes?

If you have seven zookeeper servers in your ensemble, at least four of them must be operational to have quorum. With N instances, int(N/2)+1 of them need to be running. In order to restore quorum when a data center outage takes out half your quorum, you would need to reconfigure each surviving instance in the cluster so that it had fewer servers in it, then restart all the ZK instances. I have no idea what would happen when the down data center is restored, but to get it working right, you'd have to reconfigure and restart again.

Zookeeper simply isn't designed to deal with data center failure in a two-center scenario. You can have a workable solution if you have at least three data centers and you assume that you won't ever have a situation where more than one goes down. I don't know that you can make that assumption, of course.
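The quorum rule Shawn describes can be sketched in a few lines (a minimal illustration, not part of ZooKeeper itself):

```python
# Minimal sketch of ZooKeeper's majority-quorum rule: with N configured
# servers, int(N/2)+1 must be running.
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of live servers that still forms a quorum."""
    return ensemble_size // 2 + 1

# A seven-server ensemble needs at least four servers up.
assert quorum_size(7) == 4

# Darrell's 3 + 4 split: if the four-node data center goes down, the three
# survivors are fewer than the four required. Killing one of the survivors
# cannot help, because quorum depends on the configured ensemble size (7),
# not on how many servers happen to be running right now.
survivors = 3
assert survivors < quorum_size(7)
```

This is why the surviving instances would have to be reconfigured to a smaller ensemble and restarted before quorum could be re-established.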
If you have replicas for one collection in two data centers, SolrCloud will direct queries to all of the replicas, meaning that some of them will have high latency. There is currently no logic to specify or prefer "local" replicas.

Right now the only viable solution with two data centers is independent SolrCloud installs that are kept up to date independently. I've never looked at Flume. My indexing program will update multiple independent copies of the index. All my servers are in the same location, but it would theoretically work with multiple locations too.

Thanks,
Shawn
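For reference, the standalone slave Darrell proposes would use the legacy ReplicationHandler in the remote instance's solrconfig.xml, roughly as below. The hostname, port, and core name are hypothetical, and whether pulling from a core that is managed by SolrCloud works reliably is exactly the open question in this thread:

```xml
<!-- Hypothetical legacy replication slave config (solrconfig.xml on the
     remote standalone Solr). masterUrl points at one SolrCloud node's core;
     the core name on a SolrCloud node may differ from the collection name. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://primary-solr-01:8983/solr/collection1</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```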
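Shawn's approach of updating multiple independent copies could be sketched like this (a hypothetical fan-out, not his actual program; the endpoint URLs and the injected `post` transport are made up for illustration):

```python
# Sketch: send every indexing batch to each independent SolrCloud install,
# instead of relying on cross-data-center replication. The transport is
# injected so the fan-out logic is shown without a live Solr server.
def index_to_all(docs, endpoints, post):
    """Send the same batch of docs to every cluster; return per-cluster results."""
    results = {}
    for url in endpoints:
        try:
            results[url] = post(url + "/update?commit=true", docs)
        except Exception as exc:
            # A failed data center must not block the others; record and continue.
            results[url] = exc
    return results

# Usage with a stub transport (a real client would POST the docs over HTTP):
endpoints = ["http://dc1-solr:8983/solr/mycoll", "http://dc2-solr:8983/solr/mycoll"]
statuses = index_to_all([{"id": "1"}], endpoints, post=lambda url, docs: "ok")
```

Since each cluster is updated independently, a failure in one data center leaves the other fully current, at the cost of the indexing program tracking and retrying per-cluster failures itself.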