[ https://issues.apache.org/jira/browse/KAFKA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108076#comment-17108076 ]
Bruno Cadonna commented on KAFKA-9989:
--------------------------------------

We do not really check for an empty assignment in this system test. We stop one instance, wait until the other two have rebalanced, restart the stopped instance with another version, and verify that it has processed some records. To verify the processing of records, the restarted instance requires a non-empty assignment. Admittedly, we could verify differently whether the last rebalance was successful and the restarted instance is part of the group. However, I would not think of system tests the same way as of unit tests. System tests should test usage scenarios and verify that everything works as expected, independently of the possible errors that may happen. Unit tests test a specific code path (valid case or error case). I would argue that processing of records by the restarted instance is a legitimate expectation. WDYT?

For the record, it was the following PR that resolved the failing system test:

https://github.com/apache/kafka/pull/8590

In the KIP-441 implementation, we used to re-assign the previous assignment if it was valid as an optimization, but we failed to verify whether all clients got at least one task assigned. We removed the whole optimization because it turned out to have other drawbacks as well.

> StreamsUpgradeTest.test_metadata_upgrade could not guarantee all processor gets assigned task
> ---------------------------------------------------------------------------------------------
>
> Key: KAFKA-9989
> URL: https://issues.apache.org/jira/browse/KAFKA-9989
> Project: Kafka
> Issue Type: Bug
> Components: streams, system tests
> Reporter: Boyang Chen
> Priority: Major
>
> System test StreamsUpgradeTest.test_metadata_upgrade could fail due to:
> "Never saw output 'processed [0-9]* records' on ubuntu@worker6"
> which if we take a closer look at, the rebalance happens but has no task assignment. We should fix this problem by making the rebalance result as part of the check, and skip the record processing validation when the assignment is empty.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
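
For illustration only, here is a minimal Python sketch of the position argued in the comment: the restarted instance is expected to process records, so an empty assignment is reported as a failure rather than skipped. This is not code from the Kafka system tests. The function names, the `assigned_tasks` argument, and the sample task ID are hypothetical; only the 'processed [0-9]* records' pattern comes from the issue text.

```python
import re

# Output pattern quoted in the issue description.
PROCESSED_PATTERN = re.compile(r"processed [0-9]* records")


def saw_processed_records(log_lines):
    """Return True if any log line reports processed records."""
    return any(PROCESSED_PATTERN.search(line) for line in log_lines)


def verify_restarted_instance(log_lines, assigned_tasks):
    """Hypothetical check for the restarted instance.

    An empty assignment is reported as its own failure instead of
    skipping the record-processing validation.
    """
    if not assigned_tasks:
        return "FAIL: restarted instance has an empty assignment"
    if not saw_processed_records(log_lines):
        return "FAIL: never saw output 'processed [0-9]* records'"
    return "PASS"


if __name__ == "__main__":
    # "0_0" is an illustrative task ID, not taken from the issue.
    print(verify_restarted_instance(["processed 100 records"], ["0_0"]))
```

Reporting the two failure causes separately would keep the expectation that records are processed, while making an empty assignment visible in the test output instead of surfacing only as a missing log line.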