apalan60 commented on code in PR #20772:
URL: https://github.com/apache/kafka/pull/20772#discussion_r2478180201
##########
tests/kafkatest/tests/client/consumer_test.py:
##########
@@ -74,6 +75,37 @@ def setup_consumer(self, topic, **kwargs):
         self.mark_for_collect(consumer, 'verifiable_consumer_stdout')
         return consumer
+    def await_conflict_consumers_fenced(self, conflict_consumer):
+        # Ensure every conflicting consumer actually starts once before we wait for fencing.
+        started_nodes = set()
+        def all_conflict_consumers_started():
+            for node in conflict_consumer.alive_nodes():
+                started_nodes.add(node)
+            return len(conflict_consumer.alive_nodes()) == len(conflict_consumer.nodes)
+
+        wait_until(all_conflict_consumers_started,
Review Comment:
Thanks for the reminder.
`conflict_consumer.start()` runs asynchronously. Both the detection of the
duplicate `group.instance.id` and the resulting `UnreleasedInstanceIdException`,
which interrupts the consumer process, occur during this phase.
If the newly added validation method
`await_conflict_consumers_fenced(conflict_consumer)` runs **after** the
conflict consumer has already been interrupted,
`wait_until(all_conflict_consumers_started)` never sees all consumers alive
at the same time and times out, causing the test to fail.
In my environment, adding a 2-second sleep before the validation step
reliably reproduces the issue.
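To make the race concrete, here is a minimal, self-contained sketch of the timing problem. It does not use the real kafkatest or ducktape classes: `FakeConflictConsumer`, the local `wait_until` (which returns `False` on timeout rather than raising), and all durations are assumptions made up for illustration only.

```python
import time


class FakeConflictConsumer:
    """Hypothetical stand-in for the conflict consumer service (not the kafkatest class).

    All nodes count as alive for `lifetime_sec` after construction, then all
    of them disappear, mimicking processes that exit once they are fenced.
    """

    def __init__(self, nodes, lifetime_sec):
        self.nodes = list(nodes)
        self._started_at = time.monotonic()
        self._lifetime_sec = lifetime_sec

    def alive_nodes(self):
        if time.monotonic() - self._started_at < self._lifetime_sec:
            return list(self.nodes)
        return []


def wait_until(condition, timeout_sec, backoff_sec=0.05):
    """Simplified local poll loop; returns False on timeout instead of raising."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(backoff_sec)
    return False


def all_started(consumer):
    # Same shape as the check in the diff: every node must be alive right now.
    return len(consumer.alive_nodes()) == len(consumer.nodes)


def check_after_delay(delay_sec, lifetime_sec=0.2):
    """Start the fake consumers, wait `delay_sec`, then run the validation."""
    consumer = FakeConflictConsumer(["node1", "node2"], lifetime_sec)
    time.sleep(delay_sec)  # plays the role of the extra sleep before validation
    return wait_until(lambda: all_started(consumer), timeout_sec=0.3)


if __name__ == "__main__":
    print("validate immediately:", check_after_delay(0.0))
    print("validate after the consumers exited:", check_after_delay(0.4))
```

In this sketch the check only succeeds when polling begins while the short-lived processes are still up. Once they have exited, no amount of polling can observe them as started, which is the same shape as the timeout described above.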