[ 
https://issues.apache.org/jira/browse/KAFKA-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519733#comment-17519733
 ] 

Chris Egerton edited comment on KAFKA-13764 at 4/8/22 5:26 PM:
---------------------------------------------------------------

While reviewing the existing rebalancing logic and implementing some of the 
proposed changes here, it became apparent that the first suggestion, 
assigning new connectors and tasks before performing load-balancing 
revocations, would be more complicated than initially anticipated. An easier 
alternative is to keep the current order (load-balancing revocations first, 
then assignment of new connectors and tasks), but to compute those revocations 
from the complete set of connectors and tasks configured on the cluster 
(instead of the set currently running on the cluster), and to take the 
revocations into account when assigning new connectors and tasks later on.
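
As a rough illustration of the alternative described above, here is a minimal 
Java sketch. It is not the IncrementalCooperativeAssignor code: the class 
RevocationSketch, its methods, and the simplification of treating every unit 
of work as a plain task name are all assumptions made for the example. It 
computes revocations against a per-worker ceiling derived from the total 
number of configured tasks, then places new tasks using the post-revocation 
load of each worker.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and structure are illustrative only.
final class RevocationSketch {

    // Revoke from each worker whatever exceeds ceil(totalConfigured / workers).
    // totalConfigured counts every configured task, including ones that are
    // not running anywhere yet.
    static Map<String, List<String>> revocations(
            Map<String, List<String>> running, int totalConfigured) {
        Map<String, List<String>> revoked = new LinkedHashMap<>();
        if (running.isEmpty()) {
            return revoked;
        }
        int ceiling = (totalConfigured + running.size() - 1) / running.size();
        for (Map.Entry<String, List<String>> e : running.entrySet()) {
            List<String> tasks = e.getValue();
            if (tasks.size() > ceiling) {
                revoked.put(e.getKey(),
                        new ArrayList<>(tasks.subList(ceiling, tasks.size())));
            }
        }
        return revoked;
    }

    // Assign each new task to the worker with the lowest load, where load
    // already has the revocations computed above subtracted.
    static Map<String, List<String>> assignNew(
            Map<String, List<String>> running,
            Map<String, List<String>> revoked,
            List<String> newTasks) {
        Map<String, List<String>> assigned = new LinkedHashMap<>();
        if (running.isEmpty()) {
            return assigned;
        }
        Map<String, Integer> load = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : running.entrySet()) {
            int revokedCount = revoked.getOrDefault(e.getKey(), List.of()).size();
            load.put(e.getKey(), e.getValue().size() - revokedCount);
        }
        for (String task : newTasks) {
            String target = null;
            for (Map.Entry<String, Integer> e : load.entrySet()) {
                if (target == null || e.getValue() < load.get(target)) {
                    target = e.getKey();
                }
            }
            assigned.computeIfAbsent(target, k -> new ArrayList<>()).add(task);
            load.merge(target, 1, Integer::sum);
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<String, List<String>> running = new LinkedHashMap<>();
        running.put("A", List.of("t1", "t2", "t3", "t4"));
        running.put("B", List.of()); // worker that just joined
        Map<String, List<String>> revoked = revocations(running, 6);
        System.out.println("revoked:  " + revoked);
        System.out.println("assigned: "
                + assignNew(running, revoked, List.of("t5", "t6")));
    }
}
```

In this example, worker A runs four tasks, worker B has just joined, and two 
more tasks are configured but not yet running. Using the configured total of 
six gives a ceiling of three, so only one task is revoked from A. A ceiling 
derived from the four running tasks would be two and would revoke two.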

The complication with the initial proposal is that it would require tracking 
which connectors and tasks were newly assigned to each worker when performing 
load-balancing revocations, so that they could be excluded from those 
revocations.


> Potential improvements for Connect incremental rebalancing logic
> ----------------------------------------------------------------
>
>                 Key: KAFKA-13764
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13764
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Chris Egerton
>            Assignee: Chris Egerton
>            Priority: Minor
>
> There are a few small changes that we might make to the incremental 
> rebalancing logic for Kafka Connect to improve distribution of connectors and 
> tasks across a cluster and address potential bugs:
>  # During assignment, assign new connectors and tasks across the cluster 
> before calculating revocations that may be necessary in order to balance the 
> cluster. This way, we can potentially skip a round of revocation by using 
> newly-created connectors and tasks to balance out the cluster.
>  # Perform connector and task revocation in more cases, such as when one or 
> more connectors are reconfigured to use fewer tasks, which can possibly lead 
> to an imbalanced cluster.
>  # Fix [this 
> line|https://github.com/apache/kafka/blob/06ca4850c5b2b12e972f48e03fe4f9c1032f9a3e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignor.java#L248]
>  to use the same aggregation logic that's used 
> [here|https://github.com/apache/kafka/blob/06ca4850c5b2b12e972f48e03fe4f9c1032f9a3e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/IncrementalCooperativeAssignor.java#L273-L281]
>  in order to avoid overwriting map values when they should be combined 
> instead.
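
The third item in the description above (combining map values instead of 
overwriting them) can be illustrated with a small Java sketch. This is not 
the code at the linked lines; the class MergeSketch, the combine method, and 
the worker and task names are hypothetical. It shows how Map.merge keeps the 
entries from both maps when a key is present in each, where a plain put or 
putAll would keep only the later value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not taken from IncrementalCooperativeAssignor.
final class MergeSketch {

    // Combine two worker -> items maps, concatenating the lists for any
    // worker that appears in both rather than letting one replace the other.
    static Map<String, List<String>> combine(
            Map<String, List<String>> first,
            Map<String, List<String>> second) {
        Map<String, List<String>> result = new HashMap<>();
        first.forEach((worker, items) ->
                result.put(worker, new ArrayList<>(items)));
        second.forEach((worker, items) ->
                result.merge(worker, new ArrayList<>(items), (a, b) -> {
                    a.addAll(b);
                    return a;
                }));
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> connectors = Map.of("w1", List.of("conn-a"));
        Map<String, List<String>> tasks =
                Map.of("w1", List.of("conn-a-0"), "w2", List.of("conn-a-1"));
        System.out.println(combine(connectors, tasks));
    }
}
```

With putAll in place of merge, the entry for w1 in this example would hold 
only the values from the second map.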



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
