[GitHub] [kafka-site] miguno commented on a change in pull request #324: KAFKA-8930: MirrorMaker v2 documentation

GitBox Mon, 25 Jan 2021 01:12:30 -0800


miguno commented on a change in pull request #324:
URL: https://github.com/apache/kafka-site/pull/324#discussion_r563562851




##########
File path: 27/ops.html
##########
@@ -553,7 +539,558 @@ <h3 class="anchor-heading"><a id="datacenters" 
class="anchor-link"></a><a href="
   <p>
   It is generally <i>not</i> advisable to run a <i>single</i> Kafka cluster 
that spans multiple datacenters over a high-latency link. This will incur very 
high replication latency both for Kafka writes and ZooKeeper writes, and 
neither Kafka nor ZooKeeper will remain available in all locations if the 
network between locations is unavailable.
 
-  <h3 class="anchor-heading"><a id="config" class="anchor-link"></a><a 
href="#config">6.3 Kafka Configuration</a></h3>
+  <h3 class="anchor-heading"><a id="georeplication" class="anchor-link"></a><a 
href="#georeplication">6.3 Geo-Replication (Cross-Cluster Data 
Mirroring)</a></h3>
+
+  <h4 class="anchor-heading"><a id="georeplication-overview" 
class="anchor-link"></a><a href="#georeplication-overview">Geo-Replication 
Overview</a></h4>
+
+  <p>
+    Kafka administrators can define data flows that cross the boundaries of 
individual Kafka clusters, data centers, or geo-regions. Such event streaming 
setups are often needed for organizational, technical, or legal requirements. 
Common scenarios include:
+  </p>
+
+  <ul>
+    <li>Geo-replication</li>
+    <li>Disaster recovery</li>
+    <li>Feeding edge clusters into a central, aggregate cluster</li>
+    <li>Physical isolation of clusters (such as production vs. testing)</li>
+    <li>Cloud migration or hybrid cloud deployments</li>
+    <li>Legal and compliance requirements</li>
+  </ul>
+
+  <p>
+    Administrators can set up such inter-cluster data flows with Kafka's 
MirrorMaker (version 2), a tool to replicate data between different Kafka 
environments in a streaming manner. MirrorMaker is built on top of the Kafka 
Connect framework and supports features such as:
+  </p>
+
+  <ul>
+    <li>Replicates topics (data plus configurations)</li>
+    <li>Replicates consumer groups including offsets to migrate applications 
between clusters</li>
+    <li>Replicates ACLs</li>
+    <li>Preserves partitioning</li>
+    <li>Automatically detects new topics and partitions</li>
+    <li>Provides a wide range of metrics, such as end-to-end replication 
latency across multiple data centers/clusters</li>
+    <li>Fault-tolerant and horizontally scalable operations</li>
+  </ul>
+
+  <p>
+  <em>Note: Geo-replication with MirrorMaker replicates data across Kafka 
clusters. This inter-cluster replication is different from Kafka's <a 
href="#replication">intra-cluster replication</a>, which replicates data within 
the same Kafka cluster.</em>
+  </p>
+
+  <h4 class="anchor-heading"><a id="georeplication-flows" 
class="anchor-link"></a><a href="#georeplication-flows">What Are Replication 
Flows</a></h4>
+
+  <p>
+    With MirrorMaker, Kafka administrators can replicate topics, topic 
configurations, consumer groups and their offsets, and ACLs from one or more 
source Kafka clusters to one or more target Kafka clusters, i.e., across 
cluster environments. In a nutshell, MirrorMaker consumes data from the source 
cluster with source connectors, and then replicates the data by producing to 
the target cluster with sink connectors.

Review comment:
       Thanks, @ryannedolan. Text updated.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
