[ https://issues.apache.org/jira/browse/KAFKA-13296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Rao resolved KAFKA-13296.
-------------------------------
    Resolution: Fixed

> Verify old assignment within StreamsPartitionAssignor
> -----------------------------------------------------
>
>                 Key: KAFKA-13296
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13296
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Sagar Rao
>            Priority: Major
>
> `StreamsPartitionAssignor` is responsible for assigning partitions and tasks
> to all StreamsThreads within an application.
> While it ensures that no single partition/task is assigned to two threads,
> there is limited verification of this. In particular, we had one incident in
> which a zombie thread/consumer did not clean up its own internal state
> correctly due to KAFKA-12983. This unclean zombie state implied that the
> _old assignment_ reported to `StreamsPartitionAssignor` contained a single
> partition for two consumers. As a result, both threads/consumers later
> revoked the same partition and the zombie thread could commit its unclean
> work (even though it should have been fenced), leading to duplicate output
> under EOS_v2.
> We should consider adding a check to `StreamsPartitionAssignor` that the
> _old assignment_ is valid, i.e., no partition is missing and no partition
> is assigned to two consumers. If the check fails, we should log the invalid
> _old assignment_ and send an error code back to all consumers that indicates
> that they should shut down "unclean" (i.e., without flushing and without
> committing any offsets or transactions).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
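
The validity check proposed in the description (no partition missing, no partition owned by two consumers) could look roughly like the Java sketch below. It is an illustration only, not the code that resolved this issue: `OldAssignmentCheck`, `validate`, and the use of plain strings for consumer ids and partitions are hypothetical stand-ins, and it does not use `StreamsPartitionAssignor` or any Kafka API. It also takes the issue's two criteria literally; a real assignor might need to tolerate partitions that are legitimately unowned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; not part of Kafka Streams.
class OldAssignmentCheck {

    /**
     * Returns a description of every problem found in the old assignment.
     * An empty list means the assignment passed both checks.
     *
     * oldAssignment:      consumer id -> partitions it reports as owned
     * expectedPartitions: every partition that should have an owner
     */
    static List<String> validate(Map<String, List<String>> oldAssignment,
                                 Set<String> expectedPartitions) {
        // Invert the map: partition -> distinct consumers claiming it.
        // Sorted collections keep the reported order deterministic.
        Map<String, Set<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : oldAssignment.entrySet()) {
            for (String partition : entry.getValue()) {
                owners.computeIfAbsent(partition, p -> new TreeSet<>())
                      .add(entry.getKey());
            }
        }

        List<String> problems = new ArrayList<>();

        // Check 1: no partition may be claimed by more than one consumer.
        for (Map.Entry<String, Set<String>> entry : owners.entrySet()) {
            if (entry.getValue().size() > 1) {
                problems.add("partition " + entry.getKey()
                        + " owned by multiple consumers: " + entry.getValue());
            }
        }

        // Check 2: every expected partition must be claimed by someone.
        for (String partition : new TreeSet<>(expectedPartitions)) {
            if (!owners.containsKey(partition)) {
                problems.add("partition " + partition
                        + " missing from old assignment");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // A zombie consumer still reports a partition that another
        // consumer also owns, as in the incident described above.
        Map<String, List<String>> old = new HashMap<>();
        old.put("consumer-1", List.of("input-0", "input-1"));
        old.put("zombie", List.of("input-1"));

        for (String problem : validate(old, Set.of("input-0", "input-1", "input-2"))) {
            System.out.println(problem);
        }
    }
}
```

A non-empty result corresponds to the case where the issue suggests logging the invalid old assignment and returning an error code that tells all consumers to shut down without flushing or committing.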