lianetm commented on code in PR #15778: URL: https://github.com/apache/kafka/pull/15778#discussion_r1581203195
########## tests/kafkatest/tests/client/consumer_test.py: ########## @@ -242,16 +242,15 @@ def test_static_consumer_bounce(self, clean_shutdown, static_membership, bounce_ self.rolling_bounce_consumers(consumer, keep_alive=num_keep_alive, num_bounces=num_bounces) num_revokes_after_bounce = consumer.num_revokes_for_alive() - num_revokes_before_bounce - - check_condition = num_revokes_after_bounce != 0 + # under static membership, the live consumer shall not revoke any current running partitions, # since there is no global rebalance being triggered. if static_membership: - check_condition = num_revokes_after_bounce == 0 - - assert check_condition, \ - "Total revoked count %d does not match the expectation of having 0 revokes as %d" % \ - (num_revokes_after_bounce, check_condition) + assert num_revokes_after_bounce == 0, \ + "Unexpected revocation triggered when bouncing static member. Expecting 0 but had %d revocations" % num_revokes_after_bounce + elif consumer.is_eager(): + assert num_revokes_after_bounce != 0, \ Review Comment: > So maybe we should instead try to get the set of partitions and check that it didn't change? Doable, but that wouldn't truly ensure the static membership behaviour either I guess, because the test is intentionally leaving 1 member alive that could be the leader or not. So the assignment would remain the same regardless of the static membership under CONSUMER (or Cooperative) if the single partition belongs to the live member (single partition that I guess was intentionally decided to be able to easily check the delivery semantics after the bounces) So as I see it this test is specifically crafted to validate the stickiness that static membership brings into the `RangeAssignor` ( nicely explained in [this section](https://github.com/apache/kafka/blob/e7792258df934a5c8470c2925c5d164c7d5a8e6c/clients/src/main/java/org/apache/kafka/clients/consumer/RangeAssignor.java#L58-L78) of the `RangeAssignor` class doc btw). We're trying to apply it to the CONSUMER protocol but finding not much value given the purpose of the test and the shape it has (bounce n-1 nodes, check at least 1 revocation if dynamic, none if static, regardless of partition ownership). Listening to myself telling you this makes me reconsider if we should just not run this test for the new protocol, as it was truly never intended or run for CooperativeAssignor? (I would probably rename it to something like `test_eager_stickiness_on_static_consumer_bounce` , make the use of RangeAssignor explicit, and then it looks clearer that we don't want to run such test on Cooper ativeAssignor or CONSUMER protocol. We do have other tests that ensure that static membership behaves as expected for the new protocol (ex. `test_fencing_static_consumer`), but agreed that the "owned partition not re-assigned for a static member that is bounced" is not covered in sys tests (not for CONSUMER, not for Cooperative either). We could think of a new test to cover that. The shape would be different I expect, because it would either have to rely on bouncing a member with assignment while having others with none, or ensuring that all members have at least 1 partition. It would also need a different, more complex delivery semantics validation, if any. I would just suggest a different Jira/PR for a new test, to be able to finalize migrating the current sys tests that apply to the new protocol. Makes sense? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org