[GitHub] [kafka-site] miguno commented on a change in pull request #324: KAFKA-8930: MirrorMaker v2 documentation

2021-01-25 Thread GitBox


miguno commented on a change in pull request #324:
URL: https://github.com/apache/kafka-site/pull/324#discussion_r563562309



##
File path: 27/ops.html
##
@@ -553,7 +539,558 @@ 6.3 Kafka Configuration
+  6.3 Geo-Replication (Cross-Cluster Data 
Mirroring)
+
+  Geo-Replication 
Overview
+
+  
+Kafka administrators can define data flows that cross the boundaries of 
individual Kafka clusters, data centers, or geo-regions. Such event streaming 
setups are often needed for organizational, technical, or legal requirements. 
Common scenarios include:
+  
+
+  
+Geo-replication
+Disaster recovery
+Feeding edge clusters into a central, aggregate cluster
+Physical isolation of clusters (such as production vs. testing)
+Cloud migration or hybrid cloud deployments
+Legal and compliance requirements
+  
+
+  
+Administrators can set up such inter-cluster data flows with Kafka's 
MirrorMaker (version 2), a tool to replicate data between different Kafka 
environments in a streaming manner. MirrorMaker is built on top of the Kafka 
Connect framework and supports features such as:
+  
+
+  
+Replicates topics (data plus configurations)
+Replicates consumer groups including offsets to migrate applications 
between clusters
+Replicates ACLs
+Preserves partitioning
+Automatically detects new topics and partitions
+Provides a wide range of metrics, such as end-to-end replication 
latency across multiple data centers/clusters
+Fault-tolerant and horizontally scalable operations
+  
+
+  
+  Note: Geo-replication with MirrorMaker replicates data across Kafka 
clusters. This inter-cluster replication is different from Kafka's intra-cluster replication, which replicates data within 
the same Kafka cluster.
+  
+
+  What Are Replication 
Flows
+
+  
+With MirrorMaker, Kafka administrators can replicate topics, topic 
configurations, consumer groups and their offsets, and ACLs from one or more 
source Kafka clusters to one or more target Kafka clusters, i.e., across 
cluster environments. In a nutshell, MirrorMaker consumes data from the source 
cluster with source connectors, and then replicates the data by producing to 
the target cluster with sink connectors.
+  
+
+  
+These directional flows from source to target clusters are called 
replication flows. They are defined with the format 
{source_cluster}->{target_cluster} in the MirrorMaker 
configuration file as described later. Administrators can create complex 
replication topologies based on these flows.
+  
+
+  
+Here are some example patterns:
+  
+
+  
+Active/Active high availability deployments: A->B, 
B->A
+Active/Passive or Active/Standby high availability deployments: 
A->B
+Aggregation (e.g., from many clusters to one): A->K, B->K, 
C->K
+Fan-out (e.g., from one to many clusters): K->A, K->B, 
K->C
+Forwarding: A->B, B->C, C->D
+  
+
+  
+By default, a flow replicates all topics and consumer groups. However, 
each replication flow can be configured independently. For instance, you can 
define that only specific topics or consumer groups are replicated from the 
source cluster to the target cluster.
+  
+
+  
+Here is a first example of how to configure data replication from a 
primary cluster to a secondary cluster (an 
active/passive setup):
+  
+
+# Basic settings
+clusters = primary, secondary
+primary.bootstrap.servers = broker3-primary:9092
+secondary.bootstrap.servers = broker5-secondary:9092
+
+# Define replication flows
+primary->secondary.enabled = true
+primary->secondary.topics = foobar-topic, quux-.*
+
+
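As a further illustrative sketch (not part of the change under review), the Active/Active pattern listed above could be configured along the following lines. The cluster aliases `A` and `B`, the broker host names, and the topic and group patterns are assumptions chosen for illustration. The `enabled`, `topics`, and `groups` keys follow the sample `config/connect-mirror-maker.properties` shipped with Kafka, so please check them against the Kafka version in use.

```properties
# Illustrative sketch only: aliases, hosts, and patterns below are hypothetical
clusters = A, B
A.bootstrap.servers = broker1-a:9092
B.bootstrap.servers = broker1-b:9092

# Active/Active: one replication flow per direction
A->B.enabled = true
B->A.enabled = true

# Each flow can be narrowed independently of the other
A->B.topics = orders-.*
A->B.groups = orders-app-.*
B->A.topics = payments-.*
```

Because each flow is configured on its own, leaving out the `topics` or `groups` line for one direction would leave that direction at its default of replicating all topics and consumer groups.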
+
+  Configuring 
Geo-Replication
+
+  
+The following sections describe how to configure and run a dedicated 
MirrorMaker cluster. If you want to run MirrorMaker within an existing Kafka 
Connect cluster or other supported deployment setups, please refer to 
KIP-382: MirrorMaker 2.0 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0) 
and be aware that the names of configuration settings may 
vary between deployment modes.
+  
+
+  
+Beyond what's covered in the following sections, further examples and 
information on configuration settings are available at:
+  
+
+  
+ MirrorMakerConfig (https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorMakerConfig.java),
 MirrorConnectorConfig (https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorConnectorConfig.java)
+ DefaultTopicFilter (https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/DefaultTopicFilter.java)
 for topics, DefaultGroupFilter (https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/DefaultGroupFilter.java)
 for consumer groups
+ Example configuration settings in 

[GitHub] [kafka-site] miguno commented on a change in pull request #324: KAFKA-8930: MirrorMaker v2 documentation

2021-01-25 Thread GitBox


miguno commented on a change in pull request #324:
URL: https://github.com/apache/kafka-site/pull/324#discussion_r563562851



##
File path: 27/ops.html
##
@@ -553,7 +539,558 @@ 6.3 Kafka Configuration
+  6.3 Geo-Replication (Cross-Cluster Data 
Mirroring)
+
+  Geo-Replication 
Overview
+
+  
+Kafka administrators can define data flows that cross the boundaries of 
individual Kafka clusters, data centers, or geo-regions. Such event streaming 
setups are often needed for organizational, technical, or legal requirements. 
Common scenarios include:
+  
+
+  
+Geo-replication
+Disaster recovery
+Feeding edge clusters into a central, aggregate cluster
+Physical isolation of clusters (such as production vs. testing)
+Cloud migration or hybrid cloud deployments
+Legal and compliance requirements
+  
+
+  
+Administrators can set up such inter-cluster data flows with Kafka's 
MirrorMaker (version 2), a tool to replicate data between different Kafka 
environments in a streaming manner. MirrorMaker is built on top of the Kafka 
Connect framework and supports features such as:
+  
+
+  
+Replicates topics (data plus configurations)
+Replicates consumer groups including offsets to migrate applications 
between clusters
+Replicates ACLs
+Preserves partitioning
+Automatically detects new topics and partitions
+Provides a wide range of metrics, such as end-to-end replication 
latency across multiple data centers/clusters
+Fault-tolerant and horizontally scalable operations
+  
+
+  
+  Note: Geo-replication with MirrorMaker replicates data across Kafka 
clusters. This inter-cluster replication is different from Kafka's intra-cluster replication, which replicates data within 
the same Kafka cluster.
+  
+
+  What Are Replication 
Flows
+
+  
+With MirrorMaker, Kafka administrators can replicate topics, topic 
configurations, consumer groups and their offsets, and ACLs from one or more 
source Kafka clusters to one or more target Kafka clusters, i.e., across 
cluster environments. In a nutshell, MirrorMaker consumes data from the source 
cluster with source connectors, and then replicates the data by producing to 
the target cluster with sink connectors.

Review comment:
   Thanks, @ryannedolan. Text updated.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



