Pritam Kumar created KAFKA-19211:
------------------------------------
Summary: Connect Storage Topics Sharing Across Clusters
Key: KAFKA-19211
URL: https://issues.apache.org/jira/browse/KAFKA-19211
Project: Kafka
Issue Type: Improvement
Components: connect
Reporter: Pritam Kumar
Setting up a *Kafka Connect* cluster requires provisioning three internal
topics:
# *Offset Storage Topic* – Tracks offsets of source connectors.
# *Status Storage Topic* – Maintains the state of connectors and tasks.
# *Config Storage Topic* – Stores connector configurations.
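As an illustration of the current per-cluster setup, the sketch below builds the internal-topic portion of a distributed worker configuration. The property keys ({{group.id}}, {{offset.storage.topic}}, {{status.storage.topic}}, {{config.storage.topic}}) are the standard Connect worker settings; the {{connect-<cluster>-...}} naming scheme and the {{ConnectWorkerTopics}} helper are assumptions made only for this example.

```java
import java.util.Properties;

// Hypothetical helper: shows which worker settings point at the three internal topics.
final class ConnectWorkerTopics {

    // Returns the internal-topic settings for one Connect cluster.
    // The "connect-<cluster>-..." topic names are an illustrative convention, not a Kafka default.
    static Properties internalTopicConfig(String clusterName) {
        Properties props = new Properties();
        props.setProperty("group.id", "connect-" + clusterName);
        props.setProperty("offset.storage.topic", "connect-" + clusterName + "-offsets");
        props.setProperty("status.storage.topic", "connect-" + clusterName + "-status");
        props.setProperty("config.storage.topic", "connect-" + clusterName + "-configs");
        return props;
    }

    public static void main(String[] args) {
        // Each new cluster name yields three more topics that must be provisioned.
        internalTopicConfig("orders").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Every additional cluster name passed to {{internalTopicConfig}} produces another three distinct topic names, which is the provisioning step this issue is about.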
Keeping a dedicated set of internal topics per cluster is a *design choice*
that simplifies migration, because the management topics can be replicated
across regions. However, it introduces {*}operational overhead{*}:
* {*}Every new cluster requires three new topics{*}, so the number of internal
topics grows linearly with the number of Connect clusters.
* {*}Cross-team dependencies slow down provisioning{*}, delaying deployments.
* {*}Compacted topics trigger frequent disk cleanup operations{*}, adding
maintenance complexity.
Although each cluster requires only {*}three topics{*}, their cumulative impact
grows significantly as more Connect clusters are deployed.
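To make the cumulative impact concrete, here is a rough footprint calculation. It assumes every cluster keeps the stock Connect defaults (25 partitions for the offset topic, 5 for the status topic, a single-partition config topic, and a replication factor of 3 for each); real deployments may override these, and the {{InternalTopicFootprint}} class is purely illustrative.

```java
// Illustrative arithmetic only; partition counts assume unmodified Connect defaults.
final class InternalTopicFootprint {
    static final int TOPICS_PER_CLUSTER = 3;
    // offset.storage.partitions=25, status.storage.partitions=5, config topic = 1 partition
    static final int PARTITIONS_PER_CLUSTER = 25 + 5 + 1;
    // Default replication factor of the three internal topics
    static final int REPLICATION_FACTOR = 3;

    static int topics(int clusters) {
        return clusters * TOPICS_PER_CLUSTER;
    }

    static int partitions(int clusters) {
        return clusters * PARTITIONS_PER_CLUSTER;
    }

    static int replicas(int clusters) {
        return partitions(clusters) * REPLICATION_FACTOR;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 50}) {
            System.out.printf("%d clusters -> %d topics, %d partitions, %d replicas%n",
                    n, topics(n), partitions(n), replicas(n));
        }
    }
}
```

Under these assumptions, 10 Connect clusters account for 30 topics, 310 partitions, and 930 partition replicas on the brokers. Growth is linear in the number of clusters, but the per-cluster constant is not small.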
Because these topics carry very light traffic and are compacted, Kafka Connect
clusters could *share internal topics* across multiple deployments instead of
provisioning dedicated topics for every cluster. This would bring {*}immediate
benefits{*}:
* *Drastically Reduces Topic Proliferation* – Eliminates unnecessary topic
creation.
* *Faster Kafka Connect Cluster Deployment* – No waiting for new topic
provisioning.
* How this will help different organisations:
** *Large Enterprises with Multiple Teams Using Kafka Connect*
*** *Scenario:* In large organisations, multiple teams manage different *Kafka
Connect clusters* for various data pipelines.
*** *Benefit:* Instead of waiting for new *internal topics* to be provisioned
each time a new cluster is deployed, teams can *immediately start* using
pre-existing shared topics, reducing lead time and improving efficiency.
** *Cloud-Native & Kubernetes-Based Deployments*
*** *Scenario:* Many organisations deploy Kafka Connect in *containerised
environments* (e.g., Kubernetes), where clusters are frequently *scaled
up/down* or *recreated* dynamically.
*** *Benefit:* Since internal topics are already available, new clusters can
{*}spin up instantly{*}, without waiting for *topic provisioning* or {*}Kafka
ACL approvals{*}.
* *Lower Operational Load* – Reduces disk-intensive cleanup operations.
** *Considering the current load:*
*** Broker resource utilization is expected to decrease by approximately 20%,
primarily due to reduced partition count and metadata overhead. This
optimization can enable further cluster downscaling, contributing directly to
lower infrastructure costs (e.g., fewer brokers, reduced EBS storage footprint,
and lower I/O throughput).
*** Administrative overhead and monitoring complexity are projected to decrease
by 30%, due to:
**** Fewer topics to configure, monitor, and apply retention/compaction policies
to.
**** Fewer rebalancing operations during cluster scale-in or scale-out events.
**** Less data movement and replication work in failure scenarios (e.g.,
EBS volume replacement), as fewer partitions are involved.
* *Simplified Management* – Less overhead in monitoring and maintaining
internal topics.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)