Hi all!

Wikimedia is close to using Kafka to collect webrequest access logs from 
multiple data centers.  I know that MirrorMaker is the recommended way to do 
cross-DC Kafka, but that is a lot of overhead for our remote DCs.  To set up 
a highly available Kafka cluster there, we would need to add several more 
nodes in each DC (brokers and ZooKeepers).  Our remote DCs are used mainly 
for frontend web caching, and we'd like to keep them that way.  We don't 
want to add multiple nodes to each DC just for log delivery.

We are currently producing messages from the remote DCs directly to our 
main DC's Kafka cluster.  Most of the time this works, but it isn't 
reliable: we are worried about data loss during periods of high latency or 
packet loss on the cross-DC link (we actually hit this problem last 
weekend).
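
For what it's worth, here are the producer settings we've been leaning on 
to ride out link trouble.  This is only a rough sketch using the new Java 
producer client; the hostname and topic name are made up:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class CrossDcProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Produce straight across the WAN to the main DC's cluster.
            props.put("bootstrap.servers", "kafka1001.eqiad.example.org:9092");
            props.put("acks", "all");               // wait for the full ISR
            props.put("retries", "10");             // retry transient failures
            props.put("retry.backoff.ms", "500");
            props.put("buffer.memory", "67108864"); // 64 MB of local buffering
                                                    // to ride out short blips
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            Producer<String, String> producer =
                new KafkaProducer<String, String>(props);
            producer.send(new ProducerRecord<String, String>(
                "webrequest", "..."));  // made-up topic name
            producer.close();
        }
    }

Even with acks=all and retries, though, the producer's buffer only covers 
so much; a long enough outage will still drop messages.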

Would it be possible to somehow set up a single non-HA Kafka broker in our 
remote DC and produce to that, but then fail over to producing cross-DC to 
our main DC Kafka cluster when the local broker is unavailable?
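
Roughly, the failover we're imagining would look something like the sketch 
below.  The class name, and the idea of keeping one producer open to each 
cluster, are just assumptions on our part about how this might be done:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.Callback;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class FailoverProducer {
        private final KafkaProducer<String, String> local;
        private final KafkaProducer<String, String> main;

        public FailoverProducer(Properties localProps, Properties mainProps) {
            // One producer pointed at the single local broker, one pointed
            // cross-DC at the main cluster.
            this.local = new KafkaProducer<String, String>(localProps);
            this.main = new KafkaProducer<String, String>(mainProps);
        }

        public void send(final String topic, final String value) {
            local.send(new ProducerRecord<String, String>(topic, value),
                new Callback() {
                    public void onCompletion(RecordMetadata md, Exception e) {
                        if (e != null) {
                            // Local broker unreachable: fall back to
                            // producing cross-DC to the main cluster.
                            main.send(
                                new ProducerRecord<String, String>(topic, value));
                        }
                    }
                });
        }
    }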

We could use LVS or some other load balancer/proxy in front of the Kafka 
connections and automatically switch between clusters based on 
availability.  But what would this do to live producers and their 
metadata?  Would they be able to handle a total switch of cluster 
metadata?
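
If it helps make that question concrete: producers would bootstrap against 
the VIP, something like this (hostname made up):

    Properties props = new Properties();
    // LVS VIP fronting whichever cluster is currently considered healthy.
    props.put("bootstrap.servers", "kafka-vip.esams.example.org:9092");

Our (possibly wrong) understanding is that producers only use this list 
for the initial metadata fetch and then connect directly to the brokers 
named in that metadata, hence the worry about a mid-session swap handing 
them metadata for an entirely different cluster.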

Thanks!
-Andrew Otto
