Hi all. We’re upgrading an old Solr 3.5 setup (master/slave replication) to 
SolrCloud (v7 or v8) and, at the same time, adding a new data center (for dual 
data centers). I’ve done a lot of homework, but could still use some advice. 
While the documentation explains ZooKeeper and SolrCloud pretty well, I still 
don’t have a comfortable sense of how to lay everything out physically in the 
architecture.

At present, we have planned the same physical hardware as our old master/slave 
setup (basically, 2 servers). Now, however, we’re going to duplicate that so 
we also have the same in another data center: US and Europe. For this, 
Cross Data Center Replication (CDCR; bi-directional) seems appropriate, but 
I’m not confident. Also, for the best fault tolerance and high availability, 
I’m not really sure how to lay out my ZooKeeper nodes and my Solr 
instances/shards/replicas physically across the servers. I’d like to start 
with the simplest possible setup and scale up only if necessary. Our index 
size is relatively small, I guess: ~150,000 documents.

I’m worried, for example, about spreading the ZooKeeper cluster between the 
two data centers because of potential latency across the pond. Maybe we keep 
the ZK ensemble on one side of the pond only? I imagined, for instance, 2 ZK 
nodes on one server, and one on the other (in at least one data center). But 
maybe we need 5 ZKs, with 1 on each server in the other data center? Then how 
about the Solr nodes, shards, and replicas? If anybody has done a remotely 
similar setup for production purposes, I would be grateful for any tips (and 
downright giddy for a diagram).
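
To make it a bit more concrete, here’s a rough sketch (using SolrJ) of what I 
imagine the simplest layout would be: a 3-node ZK ensemble kept entirely in 
the US data center, and a single-shard collection with 2 replicas spread 
across our 2 Solr servers there (with the European cluster built the same way 
against its own ensemble). All hostnames and the collection/configset names 
are placeholders, and I’m not at all sure this is the right shape:

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class SimplestLayoutSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical 3-node ZK ensemble, kept entirely in the US data center
        // so the ensemble itself never pays cross-Atlantic latency
        // (hostnames are placeholders).
        List<String> zkHosts = Arrays.asList(
                "zk1.us.example.com:2181",
                "zk2.us.example.com:2181",
                "zk3.us.example.com:2181");

        try (CloudSolrClient client =
                 new CloudSolrClient.Builder(zkHosts, Optional.empty()).build()) {
            // Simplest layout I can think of for ~150,000 docs:
            // 1 shard, replicationFactor=2, i.e. one replica on each of our
            // two Solr servers in this data center.
            CollectionAdminRequest
                    .createCollection("mycollection", "myconfigset", 1, 2)
                    .process(client);
        }
    }
}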

I know I’m probably not even providing enough information to begin with, but 
perhaps someone will entertain a conversation?

Thanks in advance for sharing some of your valuable time and experience.

Cody
