[ 
https://issues.apache.org/jira/browse/KAFKA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108076#comment-17108076
 ] 

Bruno Cadonna commented on KAFKA-9989:
--------------------------------------

We do not really check for an empty assignment in this system test. We stop one 
instance, wait until the other two have rebalanced, restart the stopped instance 
with another version, and verify that it has processed some records. To verify 
the processing of records, the restarted instance requires a non-empty assignment.
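
As a rough illustration of that last step (not the actual system test code), 
the check boils down to searching the restarted instance's output for the 
pattern quoted in the issue description. The function name and the sample log 
lines below are hypothetical; only the regular expression comes from the issue.

```python
import re

# Pattern quoted in the issue description; everything else is hypothetical.
PROCESSED_PATTERN = re.compile(r"processed [0-9]* records")


def saw_processed_records(log_lines):
    """Return True if any output line reports processed records."""
    return any(PROCESSED_PATTERN.search(line) for line in log_lines)


# An instance with an empty assignment never prints the pattern,
# so the check fails even though the rebalance itself succeeded.
idle_instance_output = ["rebalance finished", "no tasks assigned"]
busy_instance_output = ["rebalance finished", "processed 100 records"]
```

With these sample inputs, the check passes for `busy_instance_output` and fails 
for `idle_instance_output`, which is the situation the failing test ran into.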

Admittedly, we could verify differently whether the last rebalance was 
successful and the restarted instance is part of the group. However, I would 
not think of system tests the same way as unit tests. System tests should 
test usage scenarios and verify that everything works as expected, independently 
of the errors that may occur. Unit tests test a specific code path 
(valid case or error case).

I would argue that processing of records by the restarted instance is a 
legitimate expectation.

WDYT?

For the record, it was the following PR that resolved the failing system test:

https://github.com/apache/kafka/pull/8590

In the KIP-441 implementation, we used to re-assign the previous assignment if 
it was valid as an optimization, but we failed to verify whether all clients 
got at least one task assigned. We removed the whole optimization, because it 
turned out to have other drawbacks as well.
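
To make the missing check concrete, here is a minimal sketch in Python. It is 
not the Kafka Streams assignor code; the function name, the client ids, and the 
task ids are all hypothetical. It only shows how reusing a previous assignment 
can leave a newly joined client without any task.

```python
def clients_without_tasks(assignment, clients):
    """Return the clients (sorted) that hold no task under `assignment`.

    assignment: dict mapping client id -> list of task ids (hypothetical shape)
    clients:    ids of all clients currently in the group
    """
    return sorted(c for c in clients if not assignment.get(c))


# Previous assignment, computed while the third instance was stopped.
previous = {"client-1": ["0_0", "0_1"], "client-2": ["0_2"]}
# Group membership after the stopped instance has been restarted.
members = {"client-1", "client-2", "client-3"}

# Reusing `previous` unchanged would leave the restarted client idle.
idle = clients_without_tasks(previous, members)
```

In this example `idle` contains only `client-3`, the restarted instance, which 
would then never report any processed records.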

> StreamsUpgradeTest.test_metadata_upgrade could not guarantee all processor 
> gets assigned task
> ---------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9989
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9989
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams, system tests
>            Reporter: Boyang Chen
>            Priority: Major
>
> System test StreamsUpgradeTest.test_metadata_upgrade could fail due to:
> "Never saw output 'processed [0-9]* records' on ubuntu@worker6"
> which if we take a closer look at, the rebalance happens but has no task 
> assignment. We should fix this problem by making the rebalance result as part 
> of the check, and skip the record processing validation when the assignment 
> is empty. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
