Hi all,

I'd like to raise a discussion around [KAFKA-19070](https://issues.apache.org/jira/browse/KAFKA-19070), which proposes a small but impactful change to how Kafka Connect handles the `client.id` configuration when set explicitly by the user.

*Problem*

Currently, if a user sets a custom `client.id` via the connector configuration (e.g., `client.id=custom-id`), this value is inherited **as-is** by **all tasks** of the connector. While this does not break functionality, it leads to a few critical issues:

- *Metric registration conflicts*: Since Kafka metrics (like `consumer-metrics`, `fetch-manager-metrics`, etc.) use `client.id` as part of their identity, using the same ID across multiple tasks causes metrics to collide or get overwritten.
- *Observability and debugging challenges*: Logs and metrics are merged across tasks, making it harder to trace task-specific behavior.

*Proposal*

The proposed change (PR [#19341](https://github.com/apache/kafka/pull/19341)) appends the *task number* to the user-provided `client.id` to ensure uniqueness per task. For example:

- User configures: `client.id=my-sink`
- Task 2 uses: `client.id=my-sink-2`

*This approach:*

- Respects the user's intent
- Guarantees uniqueness
- Brings Connect behavior in line with Kafka Streams and other components that generate per-client IDs

*Another approach:*

There is also an alternative way to ensure a unique `client.id` for each task. The configuration we provide through the POST/PUT API is the connector-level config. However, the task-level config is generated by the Connect framework via the `taskConfigs` method declared in [Connector.java, line 124](https://github.com/apache/kafka/blob/2a7457f2dd95f0732562ae0708b5162e8c4a3a6d/connect/api/src/main/java/org/apache/kafka/connect/connector/Connector.java#L124) (at commit [2a7457f](https://github.com/apache/kafka/commit/2a7457f2dd95f0732562ae0708b5162e8c4a3a6d)):

`public abstract List<Map<String, String>> taskConfigs(int maxTasks);`

This method returns a list of configs (one per task), which are then used to instantiate the tasks.
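
To make the per-task list concrete, here is a minimal sketch (not the implementation in the PR) of post-processing the list returned by `taskConfigs(...)` so that each entry carries its own `client.id`. The class `ClientIdSuffixSketch` and the helper `withPerTaskClientId` are hypothetical names used only for illustration, the plain `client.id` key follows the example in this mail, and using the list position as the task number is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ClientIdSuffixSketch {

    // Hypothetical helper, not part of the Connect API: returns copies of the
    // per-task configs in which an explicitly set client.id gets "-<index>" appended.
    // Assumption: the position in the list is used as the task number.
    static List<Map<String, String>> withPerTaskClientId(List<Map<String, String>> taskConfigs) {
        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < taskConfigs.size(); i++) {
            // Copy so the caller's maps are not mutated.
            Map<String, String> copy = new HashMap<>(taskConfigs.get(i));
            String clientId = copy.get("client.id");
            if (clientId != null) {
                // Only touch configs where the user set client.id explicitly.
                copy.put("client.id", clientId + "-" + i);
            }
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three task configs that all inherited the same connector-level client.id.
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            Map<String, String> config = new HashMap<>();
            config.put("client.id", "my-sink");
            configs.add(config);
        }
        for (Map<String, String> config : withPerTaskClientId(configs)) {
            System.out.println(config.get("client.id"));
        }
    }
}
```

With three task configs that all carry `client.id=my-sink`, this prints `my-sink-0`, `my-sink-1` and `my-sink-2`. Configs without an explicit `client.id` are left untouched, so the default naming would not be affected in this sketch.
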
Since we want each task to have a unique `client.id`, we can modify the value at this point (inside `taskConfigs(...)`) by appending the task number. *So as part of this approach, we can provide a default implementation of the `taskConfigs` method that appends the task number to avoid `client.id` collisions.*

*Potential Concern*

One concern raised is that some users may rely on `client.id` for **authorization** or **quotas**. In such cases, appending the task number might introduce unexpected behavior, even though `client.id` is not used for partitioning, delivery, or offset tracking.

*Questions*

- Would this change qualify as a *bug fix*, or does it require a *KIP* due to its potential impact?
- Are there known real-world scenarios where modifying `client.id` breaks compatibility?
- Would it be better to make this opt-in via a config flag?

Happy to revise the implementation depending on community feedback. Thanks in advance for your thoughts!

Best regards,
Pritam Kumar Mishra