apalan60 commented on code in PR #20772:
URL: https://github.com/apache/kafka/pull/20772#discussion_r2478180201


##########
tests/kafkatest/tests/client/consumer_test.py:
##########
@@ -74,6 +75,37 @@ def setup_consumer(self, topic, **kwargs):
         self.mark_for_collect(consumer, 'verifiable_consumer_stdout')
         return consumer
 
+    def await_conflict_consumers_fenced(self, conflict_consumer):
+        # Ensure every conflicting consumer actually starts once before we wait for fencing.
+        started_nodes = set()
+        def all_conflict_consumers_started():
+            for node in conflict_consumer.alive_nodes():
+                started_nodes.add(node)
+            return len(conflict_consumer.alive_nodes()) == len(conflict_consumer.nodes)
+
+        wait_until(all_conflict_consumers_started,

Review Comment:
   Thanks for the reminder.  
   
   `conflict_consumer.start()` runs asynchronously. The detection of a 
duplicate `groupId` and the subsequent `UnreleasedInstanceIdException` that 
interrupts the consumer process both occur during this phase.  
   
   If the newly added validation method 
`await_conflict_consumers_fenced(conflict_consumer)` runs **after** the 
conflict consumer has already been interrupted, 
`wait_until(all_conflict_consumers_started)` times out because it never 
observes all consumers alive at once, and the test fails.  
   
   In my environment, adding a 2-second sleep before the validation step 
reliably reproduces the issue.
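
   The timing problem can be illustrated with a small, self-contained model. 
This is only a sketch under simplifying assumptions: `FakeConflictConsumer`, 
`all_started_within`, and the logical "ticks" are hypothetical stand-ins, not 
the kafkatest service or ducktape's `wait_until`. It shows that a poll which 
starts after the conflicting consumer has been interrupted never sees every 
node alive, which is what the added sleep provokes.

```python
class FakeConflictConsumer:
    """Hypothetical stand-in for the conflicting consumer service."""

    def __init__(self, nodes, fenced_at):
        self.nodes = list(nodes)
        # Logical tick at which the process is interrupted (assumed value).
        self.fenced_at = fenced_at
        self.now = 0

    def alive_nodes(self):
        # Every node is alive until the fencing tick, none afterwards.
        return list(self.nodes) if self.now < self.fenced_at else []


def all_started_within(consumer, start_tick, timeout_ticks):
    """Poll once per tick from start_tick, like a simplified wait loop."""
    for tick in range(start_tick, start_tick + timeout_ticks):
        consumer.now = tick
        if len(consumer.alive_nodes()) == len(consumer.nodes):
            return True
    return False  # corresponds to the wait timing out


def demo():
    nodes = ["node-1", "node-2"]
    # Validation starts right away: the consumers are still alive.
    prompt = all_started_within(
        FakeConflictConsumer(nodes, fenced_at=3), start_tick=0, timeout_ticks=10)
    # Validation starts late (the added sleep): the consumers are already gone.
    delayed = all_started_within(
        FakeConflictConsumer(nodes, fenced_at=3), start_tick=5, timeout_ticks=10)
    return prompt, delayed


if __name__ == "__main__":
    print(demo())  # prints (True, False)
```

   In this model the outcome depends only on whether polling begins before or 
after the fencing tick, which matches the behaviour seen with the sleep.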
   


