codelipenghui commented on code in PR #23124: URL: https://github.com/apache/pulsar/pull/23124#discussion_r1714267316
########## pip/pip-370.md: ########## @@ -0,0 +1,58 @@ +# PIP-370: configurable remote topic creation in geo-replication + +# Background knowledge + +Users using Geo-Replication backup data across multiple clusters, as well as Admin APIs related to Geo-Replication and internal replicators of brokers, will trigger topics of auto-creation between clusters. +- For partitioned topics. + - After enabling namespace-level Geo-Replication: the broker will create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. It does not depend on enabling `allowAutoTopicCreation`. + - When enabling topic-level Geo-Replication on a partitioned topic: the broker will create topics on the remote cluster automatically. It does not depend on enabling `allowAutoTopicCreation`. +- For non-partitioned topics and partitions of partitioned topics. + - The internal Geo-Replicator will trigger topics auto-creation for remote clusters. **(Highlight)** It depends on enabling `allowAutoTopicCreation`. In fact, this behavior is not related to Geo-Replication, it is the behavior of the internal producer of Geo-Replicator, + +# Motivation + +In the following scenarios, automatic topic creation across clusters is problematic due to race conditions during deployments, and there is no choice that prevents pulsar resource creation affects each other between clusters. + +- Users want to maintain pulsar resources manually. +- Users pulsar resources using `GitOps CD` automated deployment, for which + - Clusters are deployed simultaneously without user intervention. + - Each cluster is precisely configured from git repo config variables - including the list of all tenants/namespaces/topics to be created in each cluster. + - Clusters are configured to be exact clones of each other in terms of pulsar resources. 
+ +**Passed solution**: disable `allowAutoTopicCreation`, the APIs `pulsar-admin topics create-partitioned-topic` still create topics on the remote cluster when enabled namespace level replication, the API `enable topic-level replication` still create topics, And the internal replicator will keep printing error logs due to a not found error. + +# Goals + +Introduce a flag to disable the replicators to automatically trigger topic creation. + +# Detailed Design + +## Configuration + +**broker.conf** +```properties +# It is not a dynamic config, the default value is "true" to preserve backward-compatible behavior. +# See details below. +replicationTriggerRemoteTopicCreation=true Review Comment: I would like to suggest renaming this to `createTopicToRemoteClusterForReplication`. ########## pip/pip-370.md: ########## @@ -0,0 +1,58 @@ +# PIP-370: configurable remote topic creation in geo-replication + +# Background knowledge + +Users using Geo-Replication backup data across multiple clusters, as well as Admin APIs related to Geo-Replication and internal replicators of brokers, will trigger topics of auto-creation between clusters. +- For partitioned topics. + - After enabling namespace-level Geo-Replication: the broker will create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. It does not depend on enabling `allowAutoTopicCreation`. + - When enabling topic-level Geo-Replication on a partitioned topic: the broker will create topics on the remote cluster automatically. It does not depend on enabling `allowAutoTopicCreation`. +- For non-partitioned topics and partitions of partitioned topics. + - The internal Geo-Replicator will trigger topics auto-creation for remote clusters. **(Highlight)** It depends on enabling `allowAutoTopicCreation`. 
In fact, this behavior is not related to Geo-Replication, it is the behavior of the internal producer of Geo-Replicator, + +# Motivation + +In the following scenarios, automatic topic creation across clusters is problematic due to race conditions during deployments, and there is no choice that prevents pulsar resource creation affects each other between clusters. + +- Users want to maintain pulsar resources manually. +- Users pulsar resources using `GitOps CD` automated deployment, for which + - Clusters are deployed simultaneously without user intervention. + - Each cluster is precisely configured from git repo config variables - including the list of all tenants/namespaces/topics to be created in each cluster. + - Clusters are configured to be exact clones of each other in terms of pulsar resources. + +**Passed solution**: disable `allowAutoTopicCreation`, the APIs `pulsar-admin topics create-partitioned-topic` still create topics on the remote cluster when enabled namespace level replication, the API `enable topic-level replication` still create topics, And the internal replicator will keep printing error logs due to a not found error. + +# Goals + +Introduce a flag to disable the replicators to automatically trigger topic creation. + +# Detailed Design + +## Configuration + +**broker.conf** +```properties +# It is not a dynamic config, the default value is "true" to preserve backward-compatible behavior. +# See details below. +replicationTriggerRemoteTopicCreation=true +``` + +## Design & Implementation Details + +- If `replicationTriggerRemoteTopicCreation` is set to `false`. + 1. After enabling namespace-level Geo-Replication: the broker will not create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. + 2. When enabling topic-level Geo-Replication on a partitioned topic: broker will not create topics on the remote cluster automatically. + 3. 
The internal Geo-Replicator will not trigger topic auto-creation for remote clusters, it just keeps retrying to check if the topic exists on the remote cluster, once the topic is created, the replicator starts. + 4. It does not change the behavior that creating subscriptions after enabling `enableReplicatedSubscriptions`, the subscription will also be created on the remote cluster after users enable. `enableReplicatedSubscriptions`. + 5. The config `allowAutoTopicCreation` still works for the local cluster as before, it will not be affected by the new config `replicationTriggerRemoteTopicCreation`. +- If `replicationTriggerRemoteTopicCreation` is set to `true`. + a. All components work as before. + +# Backward & Forward Compatibility + +The feature that disables `replicationTriggerRemoteTopicCreation` depends on the API `PulsarClient.getPartitionsForTopic(String topic, boolean metadataAutoCreationEnabled)`, which was introduced by [PIP-344](https://github.com/apache/pulsar/blob/master/pip/pip-344.md). + Review Comment: PIP-370 requires the changes from PIP-344, but that is not related to compatibility, right? The changes introduced by PIP-370 will not change the default behavior, except for the bug fixes, so we can ensure full compatibility. Users need to disable `replicationTriggerRemoteTopicCreation` manually, and they can go back to the default behavior by enabling `replicationTriggerRemoteTopicCreation` again. ########## pip/pip-370.md: ########## @@ -0,0 +1,58 @@ +# PIP-370: configurable remote topic creation in geo-replication + +# Background knowledge + +Users using Geo-Replication backup data across multiple clusters, as well as Admin APIs related to Geo-Replication and internal replicators of brokers, will trigger topics of auto-creation between clusters. +- For partitioned topics. + - After enabling namespace-level Geo-Replication: the broker will create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. 
It does not depend on enabling `allowAutoTopicCreation`. + - When enabling topic-level Geo-Replication on a partitioned topic: the broker will create topics on the remote cluster automatically. It does not depend on enabling `allowAutoTopicCreation`. +- For non-partitioned topics and partitions of partitioned topics. + - The internal Geo-Replicator will trigger topics auto-creation for remote clusters. **(Highlight)** It depends on enabling `allowAutoTopicCreation`. In fact, this behavior is not related to Geo-Replication, it is the behavior of the internal producer of Geo-Replicator, + +# Motivation + +In the following scenarios, automatic topic creation across clusters is problematic due to race conditions during deployments, and there is no choice that prevents pulsar resource creation affects each other between clusters. + +- Users want to maintain pulsar resources manually. +- Users pulsar resources using `GitOps CD` automated deployment, for which + - Clusters are deployed simultaneously without user intervention. + - Each cluster is precisely configured from git repo config variables - including the list of all tenants/namespaces/topics to be created in each cluster. + - Clusters are configured to be exact clones of each other in terms of pulsar resources. + +**Passed solution**: disable `allowAutoTopicCreation`, the APIs `pulsar-admin topics create-partitioned-topic` still create topics on the remote cluster when enabled namespace level replication, the API `enable topic-level replication` still create topics, And the internal replicator will keep printing error logs due to a not found error. + +# Goals + +Introduce a flag to disable the replicators to automatically trigger topic creation. + +# Detailed Design + +## Configuration + +**broker.conf** +```properties +# It is not a dynamic config, the default value is "true" to preserve backward-compatible behavior. +# See details below. 
+replicationTriggerRemoteTopicCreation=true +``` + +## Design & Implementation Details + +- If `replicationTriggerRemoteTopicCreation` is set to `false`. + 1. After enabling namespace-level Geo-Replication: the broker will not create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. + 2. When enabling topic-level Geo-Replication on a partitioned topic: broker will not create topics on the remote cluster automatically. + 3. The internal Geo-Replicator will not trigger topic auto-creation for remote clusters, it just keeps retrying to check if the topic exists on the remote cluster, once the topic is created, the replicator starts. + 4. It does not change the behavior that creating subscriptions after enabling `enableReplicatedSubscriptions`, the subscription will also be created on the remote cluster after users enable. `enableReplicatedSubscriptions`. + 5. The config `allowAutoTopicCreation` still works for the local cluster as before, it will not be affected by the new config `replicationTriggerRemoteTopicCreation`. +- If `replicationTriggerRemoteTopicCreation` is set to `true`. + a. All components work as before. Review Comment: We'd better add the details of the existing behavior. Otherwise, it is not easy for a reviewer who is not very familiar with the existing behavior to understand the difference between `replicationTriggerRemoteTopicCreation` being enabled and disabled. ########## pip/pip-370.md: ########## @@ -0,0 +1,58 @@ +# PIP-370: configurable remote topic creation in geo-replication + +# Background knowledge + +Users using Geo-Replication backup data across multiple clusters, as well as Admin APIs related to Geo-Replication and internal replicators of brokers, will trigger topics of auto-creation between clusters. +- For partitioned topics. 
+ - After enabling namespace-level Geo-Replication: the broker will create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. It does not depend on enabling `allowAutoTopicCreation`. + - When enabling topic-level Geo-Replication on a partitioned topic: the broker will create topics on the remote cluster automatically. It does not depend on enabling `allowAutoTopicCreation`. +- For non-partitioned topics and partitions of partitioned topics. + - The internal Geo-Replicator will trigger topics auto-creation for remote clusters. **(Highlight)** It depends on enabling `allowAutoTopicCreation`. In fact, this behavior is not related to Geo-Replication, it is the behavior of the internal producer of Geo-Replicator, + +# Motivation + +In the following scenarios, automatic topic creation across clusters is problematic due to race conditions during deployments, and there is no choice that prevents pulsar resource creation affects each other between clusters. + +- Users want to maintain pulsar resources manually. +- Users pulsar resources using `GitOps CD` automated deployment, for which + - Clusters are deployed simultaneously without user intervention. + - Each cluster is precisely configured from git repo config variables - including the list of all tenants/namespaces/topics to be created in each cluster. + - Clusters are configured to be exact clones of each other in terms of pulsar resources. + +**Passed solution**: disable `allowAutoTopicCreation`, the APIs `pulsar-admin topics create-partitioned-topic` still create topics on the remote cluster when enabled namespace level replication, the API `enable topic-level replication` still create topics, And the internal replicator will keep printing error logs due to a not found error. + +# Goals + +Introduce a flag to disable the replicators to automatically trigger topic creation. 
+ +# Detailed Design + +## Configuration + +**broker.conf** +```properties +# It is not a dynamic config, the default value is "true" to preserve backward-compatible behavior. +# See details below. +replicationTriggerRemoteTopicCreation=true +``` + +## Design & Implementation Details + +- If `replicationTriggerRemoteTopicCreation` is set to `false`. + 1. After enabling namespace-level Geo-Replication: the broker will not create topics on the remote cluster automatically when calling `pulsar-admin topics create-partitioned-topic`. Review Comment: Will the topic on the remote cluster be created during topic creation on the local cluster? Would it be better to move this to the replicator? It can check the topic in the local cluster and the remote cluster and then decide whether to create the topic on the remote cluster. This way, both partitioned and non-partitioned topics can follow the same approach. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
