Hi,

Here are some of my experiences with MirrorMaker, but I'm also eager to read what others do:
1. The main issue for me is rebalancing. If you have several instances of MM under the same group, then when one of them dies, loses network connectivity, or you just need to add new partitions to the whitelist/blacklist, the whole thing starts rebalancing. This can take quite a bit of time (depending on the number of MM instances in the group), during which messages are either not mirrored at all or are sent only sporadically. Initially I had one MM group mirroring from 3 remote clusters into the main cluster, which proved to be a big mistake: when one cluster was unreachable, the others would suffer. Now I run a separate group for each cluster, and I'm thinking of drilling down even further and splitting by topic.

2. Generally MM is pretty straightforward. The way I do it is to set num.streams to the number of partitions in the remote cluster and num.producers to the number of partitions in the destination cluster. The only thing to be cautious about is the num.consumer.fetchers parameter, because it doesn't do what the documentation says. See https://issues.apache.org/jira/browse/KAFKA-2008

3. As for monitoring, I track the number of incoming messages into the source cluster and the same for the destination cluster, then graph them on the same panel. As long as the two are about the same, it's a good indicator that mirroring is working fine.

Best regards,
Karolis Pocius
IT System Engineer
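To illustrate the monitoring check in point 3, here is a minimal sketch in Python. It assumes you already collect per-interval incoming message counts for the source and destination clusters (for example from the brokers' MessagesInPerSec metric) and that the destination receives only mirrored traffic. The function names and the 5% tolerance are hypothetical and only for illustration; they are not part of MirrorMaker.

```python
from typing import Sequence


def mirror_shortfall(source_counts: Sequence[int], dest_counts: Sequence[int]) -> float:
    """Fraction of source messages not (yet) seen on the destination.

    Both sequences hold incoming message counts over the same time window.
    Returns 0.0 when the source received nothing in the window.
    """
    src_total = sum(source_counts)
    dst_total = sum(dest_counts)
    if src_total == 0:
        return 0.0
    return (src_total - dst_total) / src_total


def mirroring_looks_healthy(
    source_counts: Sequence[int],
    dest_counts: Sequence[int],
    tolerance: float = 0.05,  # hypothetical threshold, tune for your traffic
) -> bool:
    """True when the two totals are 'about the same', within the tolerance."""
    return abs(mirror_shortfall(source_counts, dest_counts)) <= tolerance


if __name__ == "__main__":
    source = [1000, 1200, 900]   # made-up sample counts per interval
    dest = [990, 1190, 905]
    print("healthy" if mirroring_looks_healthy(source, dest) else "check MirrorMaker")
```

A check like this could feed an alert alongside the graph. A sustained shortfall, as opposed to a brief dip, is what would suggest a stalled or rebalancing MM group.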
On Thu, 2016-12-15 at 14:28 +0000, Greenhorn Techie wrote:

Hi,

Good afternoon.

We are implementing Kafka MirrorMaker to replicate data from our Production Kafka cluster to our DR Kafka cluster. I'm trying to understand the answers to the following queries:

What are the known bottlenecks / issues one needs to be aware of from a MirrorMaker perspective? Also, are there any documented or accepted best practices for using MirrorMaker?

How should MirrorMaker replication jobs be monitored to identify any issues during job execution, thereby alerting the DevOps/Support team?

I would be grateful to hear opinions from the experts out there.

Thanks
