[
https://issues.apache.org/jira/browse/KAFKA-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantine Karantasis resolved KAFKA-9848.
-------------------------------------------
Resolution: Fixed
> Avoid triggering scheduled rebalance delay when task assignment fails but
> Connect workers remain in the group
> -------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-9848
> URL: https://issues.apache.org/jira/browse/KAFKA-9848
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 2.3.1, 2.5.0, 2.4.1
> Reporter: Konstantine Karantasis
> Assignee: Konstantine Karantasis
> Priority: Major
> Fix For: 2.3.2, 2.6.0, 2.4.2, 2.5.1
>
>
> There are cases where a Connect worker does not receive its task assignments
> successfully after a rebalance but still remains in the group. For
> example, when a SyncGroup response is lost, a worker will not get its expected
> assignments but will rejoin the group immediately and trigger another
> rebalance.
> With incremental cooperative rebalancing, task assignments that are computed
> and sent by the leader but are not received by any of the members are marked
> as lost assignments in the subsequent rebalance. The presence of lost
> assignments activates the scheduled rebalance delay (property), and the
> missing tasks are not assigned until this delay expires.
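
The behaviour described above can be sketched roughly as follows. This is
a minimal illustration, not the Connect assignor code: the class name, the
method names, the task ids and the 300000 ms delay value are all
hypothetical. It only shows the idea that tasks assigned in the previous
round but reported by no member are treated as lost, and that any lost task
activates the delay.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and values are illustrative assumptions,
// not part of Kafka Connect.
public class LostAssignments {

    // Tasks the leader assigned in the previous round that no member
    // reports owning in the current round.
    static Set<String> lostTasks(Set<String> previouslyAssigned,
                                 Set<String> currentlyReported) {
        Set<String> lost = new TreeSet<>(previouslyAssigned);
        lost.removeAll(currentlyReported);
        return lost;
    }

    // Delay applied before the lost tasks are reassigned: the configured
    // delay if anything looks lost, otherwise no delay.
    static long delayMs(Set<String> lost, long scheduledRebalanceDelayMs) {
        return lost.isEmpty() ? 0L : scheduledRebalanceDelayMs;
    }

    public static void main(String[] args) {
        Set<String> previous = Set.of("conn-0", "conn-1", "conn-2");
        Set<String> reported = Set.of("conn-0");
        Set<String> lost = lostTasks(previous, reported);
        System.out.println(lost + " delayed by " + delayMs(lost, 300_000L) + " ms");
    }
}
```

In this sketch, a worker that merely missed a SyncGroup response looks the
same as a worker that left the group, which is why the delay is triggered
even though every worker is still present.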
> This situation can be improved in two cases:
> a) When it is the leader that failed to receive the new assignments from the
> broker coordinator (for example, if the SyncGroup request or response was
> lost). If this worker remains the leader of the group in the subsequent
> rebalance round, it can detect that the previous assignment was not
> successfully applied by checking the expected generation.
> b) If one or more regular members did not receive their assignments
> successfully but have joined the latest round of rebalancing, they can be
> assigned the tasks that remain unassigned from the previous assignment
> immediately, without these tasks being marked as lost. The leader can detect
> this by checking that some tasks appear lost since the previous assignment
> while the number of workers is unchanged between the two rounds of
> rebalancing. In this case, the leader can go ahead and assign the missing
> tasks as new tasks immediately.
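
The two checks in (a) and (b) can be sketched as a single decision on the
leader. This is a hedged illustration under assumed names: the class, the
enum and every parameter are hypothetical and do not come from the Connect
code base. For case (a) the sketch only reports the detection, since the
description above does not say what the leader does next.

```java
import java.util.Set;

// Hypothetical sketch of the leader-side checks; all names are
// illustrative assumptions, not Kafka Connect APIs.
public class LostTaskPolicy {

    enum Outcome {
        NOTHING_LOST,
        PREVIOUS_ASSIGNMENT_NOT_APPLIED, // case (a)
        ASSIGN_IMMEDIATELY,              // case (b)
        START_DELAY                      // a worker really seems to be gone
    }

    static Outcome decide(int expectedGeneration,
                          int leaderAppliedGeneration,
                          int previousWorkerCount,
                          int currentWorkerCount,
                          Set<String> lostTasks) {
        // (a) The leader itself never applied the assignment it computed,
        // so the generation it applied differs from the one it expected.
        if (leaderAppliedGeneration != expectedGeneration) {
            return Outcome.PREVIOUS_ASSIGNMENT_NOT_APPLIED;
        }
        if (lostTasks.isEmpty()) {
            return Outcome.NOTHING_LOST;
        }
        // (b) Tasks look lost, but the same number of workers rejoined,
        // so the missing tasks can be handed out as new tasks right away.
        if (previousWorkerCount == currentWorkerCount) {
            return Outcome.ASSIGN_IMMEDIATELY;
        }
        // Otherwise fall back to the scheduled rebalance delay.
        return Outcome.START_DELAY;
    }

    public static void main(String[] args) {
        System.out.println(decide(5, 5, 3, 3, Set.of("conn-1")));
        System.out.println(decide(5, 5, 3, 2, Set.of("conn-1")));
    }
}
```

Checking the generation before anything else is a choice made for this
sketch only; the order of the checks is an assumption.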