gharris1727 opened a new pull request #9765:
URL: https://github.com/apache/kafka/pull/9765


   When two cooperative rebalances take place soon after one another, a prior 
rebalance may not complete before the next rebalance is started.
   Under Eager rebalancing, no tasks would have been started, so the subsequent 
onRevoked call is intentionally skipped whenever rebalanceResolved is false.
   Under Cooperative rebalancing, the same logic causes the DistributedHerder 
to skip stopping the connectors/tasks that are revoked in the second rebalance.
   The DistributedHerder still removes the revoked connectors/tasks from its 
assignment, so the DistributedHerder and the Worker end up with different views 
of which connectors/tasks are running.
   As a result, the connector/task instances that should have been stopped 
disappear from the rebalance protocol but are left running until their workers 
are halted or they fail.
   Connectors/tasks that are then reassigned to other workers by the rebalance 
protocol are duplicated and run concurrently with the zombie connectors/tasks.
   Connectors/tasks that are reassigned back to the same worker encounter 
exceptions in Worker, indicating that the connector/task already exists and is 
already running.
   
   * Add a test for revoking and then reassigning a connector under normal 
circumstances
   * Add a test for revoking and then reassigning a connector following an 
incomplete cooperative rebalance
   * Change expectRebalance to make assignment fields mutable before passing 
them into the DistributedHerder
   * Only skip revocation for the Eager protocol, and never skip revocation for 
cooperative/sessioned protocols
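
   The effect of the last bullet can be sketched with a toy model. This is not the actual DistributedHerder code: `ToyHerder`, `Protocol`, `skipOnlyForEager`, and `zombies` are hypothetical names invented for illustration, and only `onRevoked` and `rebalanceResolved` are taken from the description above. The sketch assumes the herder always drops revoked items from its assignment and only conditionally stops them on the worker.

```java
import java.util.HashSet;
import java.util.Set;

public class RevocationSketch {
    // Hypothetical stand-in for the Connect rebalance protocols; not the real enum.
    enum Protocol { EAGER, COOPERATIVE, SESSIONED }

    // Toy herder: 'assignment' is what the herder believes it owns,
    // 'running' is what the worker is actually running.
    static class ToyHerder {
        final Protocol protocol;
        final boolean skipOnlyForEager;   // true = guard proposed in this PR
        final Set<String> assignment = new HashSet<>();
        final Set<String> running = new HashSet<>();
        boolean rebalanceResolved = true;

        ToyHerder(Protocol protocol, boolean skipOnlyForEager) {
            this.protocol = protocol;
            this.skipOnlyForEager = skipOnlyForEager;
        }

        void start(String name) {
            assignment.add(name);
            running.add(name);
        }

        void onRevoked(Set<String> revoked) {
            // Old guard: skip whenever the prior rebalance is unresolved.
            // New guard: skip only when the protocol is also EAGER.
            boolean skip = skipOnlyForEager
                    ? protocol == Protocol.EAGER && !rebalanceResolved
                    : !rebalanceResolved;
            if (!skip) {
                running.removeAll(revoked);   // actually stop the instances
            }
            assignment.removeAll(revoked);    // the assignment is always updated
        }
    }

    // Instances still running on the worker that the herder no longer tracks.
    static Set<String> zombies(ToyHerder herder) {
        Set<String> result = new HashSet<>(herder.running);
        result.removeAll(herder.assignment);
        return result;
    }

    public static void main(String[] args) {
        for (boolean fixed : new boolean[] {false, true}) {
            ToyHerder herder = new ToyHerder(Protocol.COOPERATIVE, fixed);
            herder.start("connector-0");
            herder.rebalanceResolved = false;   // prior rebalance did not complete
            herder.onRevoked(Set.of("connector-0"));
            System.out.println((fixed ? "new" : "old") + " guard, zombies: " + zombies(herder));
        }
    }
}
```

   In this model the old guard leaves `connector-0` running after it has left the assignment, while the new guard stops it, so the herder and worker stay in agreement.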
   
   Signed-off-by: Greg Harris <gr...@confluent.io>
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   



