Hi Franz,

The MirrorMaker instances are colocated with the brokers, yes. These are beefy, 
dedicated hosts that are handling the loads admirably.

The core cluster receives about 400k msg/sec, 1GB/sec across 20 topics at peak 
times. CPU usage occasionally crosses 50% during peak times. If I find that the 
hardware is getting overloaded, or that our consumer load on the core cluster 
increases significantly, I will move MM to a separate set of hosts. Right now, 
it’s quite cost-effective as is. :)
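
As a rough sketch of the thread arithmetic from the earlier message quoted 
below (threads per instance = total partition count / 100 instances, and 
25*30=750 MirrorMaker threads per broker): the function names and the 
2500-partition figure are hypothetical, chosen only so the result matches 
the "about 25 threads" mentioned there, and rounding up is an assumption 
for partition counts that do not divide evenly.

```python
import math


def threads_per_instance(total_partitions: int, instances: int) -> int:
    """Threads each MM instance needs so every partition gets a consumer thread.

    Rounding up is an assumption; the original message only gives the ratio.
    """
    return math.ceil(total_partitions / instances)


def mm_threads_per_broker(threads_per_inst: int, edge_clusters: int) -> int:
    """Each broker runs one MM instance per edge cluster."""
    return threads_per_inst * edge_clusters


# Hypothetical partition count for one edge cluster's mirrored topics.
total_partitions = 2500
instances_per_edge_cluster = 100   # one MM instance per core broker
edge_clusters = 30

per_instance = threads_per_instance(total_partitions, instances_per_edge_cluster)
per_broker = mm_threads_per_broker(per_instance, edge_clusters)
print(per_instance, per_broker)
```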

—
Peter


> On Mar 12, 2019, at 10:02 AM, Franz van Betteraey <fvbetter...@web.de> wrote:
> 
> Hi Peter,
> 
> these are remarkable numbers but to be honest I do not get where you run the 
> Mirror Maker processes. 
> Do you run them near the remote clusters or near the target (core?) 
> datacenter cluster?
> 
> As I understand it, you run 30 MirrorMaker instances (one for each remote 
> cluster) on each of the 100 Kafka nodes of your core datacenter cluster.
> So you run MirrorMaker on the same machines as the Kafka nodes and do not 
> use dedicated machines for the MirrorMaker processes?
> 
> 
> Best regards,
>  Franz
>  
> 
> Sent: Tuesday, March 12, 2019 at 4:24 PM
> From: "Peter Bukowinski" <pmb...@gmail.com>
> To: users@kafka.apache.org
> Subject: Re: Kafka Mirror Maker place of execution
> I have a setup with about 30 remote kafka clusters and one cluster in a core 
> datacenter where I aggregate data from all the remote clusters. The remote 
> clusters have 30 nodes each with moderate specs. The core cluster has 100 
> nodes with lots of cpu, ram, and ssd storage per node.
> 
> I run MirrorMaker directly on the core brokers. Each broker runs one 
> MirrorMaker instance per edge cluster, sharing the same group.id. Since I’m 
> running 100 instances per edge cluster, the number of threads I use = (total 
> partition count of topics I am mirroring) / 100. In practice, each MM 
> instance runs with about 25 threads, so each broker runs 25*30=750 threads of 
> MirrorMaker.
> 
> I’ve been running this setup for many months and it’s proved to be stable 
> with very low consumer lag.
> 
> --
> Peter Bukowinski
> 
>> On Mar 12, 2019, at 6:42 AM, Ryanne Dolan <ryannedo...@gmail.com> wrote:
>> 
>> Franz, you can run MM on or near either source or target cluster, but it's
>> more efficient near the target because this minimizes producer latency. If
>> latency is high, producers will block waiting on ACKs for in-flight records,
>> which reduces throughput.
>> 
>> I recommend running MM near the target cluster but not necessarily on the
>> same machines, because often Kafka nodes are relatively expensive, with SSD
>> arrays and huge IO bandwidth etc, which isn't necessary for MM.
>> 
>> Ryanne
>> 
>> On Tue, Mar 12, 2019, 8:13 AM Franz van Betteraey <fvbetter...@web.de>
>> wrote:
>> 
>>> Hi all,
>>> 
>>> there are best practices out there which recommend running the Mirror
>>> Maker on the target cluster.
>>> 
>>> https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
>>> 
>>> I wonder why this recommendation exists because ultimately all data must
>>> cross the border between the clusters, regardless of whether they are
>>> consumed at the target or produced at the source. A reason I can imagine is
>>> that the Mirror Maker supports multiple consumers but only one producer,
>>> so consuming data over the link with the greater latency might be sped up
>>> by the use of multiple consumers.
>>> 
>>> If performance through multithreading is the point, would it be useful
>>> to use several producers (one per consumer) to replicate the data (with a
>>> custom replication process)? Does anyone know why the Mirror Maker shares
>>> a single producer among all consumers?
>>> 
>>> My use case is the replication of data from several source clusters (~10)
>>> to a single target cluster. I would prefer to run the replication process
>>> on the source clusters to avoid too many replication processes (one for
>>> each source) on the target cluster.
>>> 
>>> Hints and suggestions on this topic are very welcome.
>>> 
>>> Best regards
>>> Franz
>>> 
>>> If you would like to earn some SO recommendation points feel free to
>>> answer this question on SO ;-)
>>> https://stackoverflow.com/q/55122268/367285
