[ https://issues.apache.org/jira/browse/KAFKA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800434#comment-15800434 ]
Ewen Cheslack-Postava commented on KAFKA-4558:
----------------------------------------------
Number of partitions probably isn't reliable since some consumers could end up
with 0 assigned partitions (unless we always make sure tests have enough
partitions to go around). Maybe just a metric indicating the consumer
state, i.e. joining, member, etc.? This is actually kind of tricky because we
don't just need the partitions assigned, we also need to make sure we've
looked up and set the fetch offsets in the consumer. This might require
per-assigned-partition offset information -- maybe have a metric for the list of
assigned partitions and then a metric for the offset (or lag) for each of them?
Given some issues with other tests we've seen, I think there are others that
have the same requirement. I'm fine with {{@ignore}}ing some tests if people
think that's valuable, but I think it'd be better to just try to get the fix in
asap since it'll be quite a bit of effort to evaluate all the tests using
ProduceConsumeValidate -- I see at least 13 or so in kafkatest.
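
To make the readiness condition above concrete, here is a rough sketch in
Python of what the check could look like. It assumes a hypothetical snapshot
dict with the keys "state", "assigned_partitions" and "fetch_offsets"; none of
these are existing Kafka metric names, and read_snapshot stands in for however
the test would actually fetch them from the consumer. A member with zero
assigned partitions counts as ready, which covers the case where there aren't
enough partitions to go around.

```python
import time


def consumer_ready(snapshot):
    """True once the consumer is a group member and every assigned
    partition has a fetch offset set.

    `snapshot` is a hypothetical dict, not an existing Kafka metric:
      state               -- e.g. "joining" or "member"
      assigned_partitions -- list of (topic, partition) tuples
      fetch_offsets       -- dict mapping (topic, partition) -> offset or None
    """
    if snapshot.get("state") != "member":
        return False
    assigned = snapshot.get("assigned_partitions", [])
    offsets = snapshot.get("fetch_offsets", {})
    # all() over an empty list is True, so a member with zero assigned
    # partitions is treated as ready.
    return all(offsets.get(tp) is not None for tp in assigned)


def wait_until_ready(read_snapshot, timeout_sec=30.0, backoff_sec=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll read_snapshot() until consumer_ready() holds or the timeout
    expires. Returns True if the consumer became ready, False otherwise."""
    deadline = clock() + timeout_sec
    while True:
        if consumer_ready(read_snapshot()):
            return True
        if clock() >= deadline:
            return False
        sleep(backoff_sec)
```

The producer would only be started once wait_until_ready(...) returns True,
replacing the current check that a process with the PID exists.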
> throttling_test fails if the producer starts too fast.
> ------------------------------------------------------
>
> Key: KAFKA-4558
> URL: https://issues.apache.org/jira/browse/KAFKA-4558
> Project: Kafka
> Issue Type: Bug
> Reporter: Apurva Mehta
> Assignee: Apurva Mehta
>
> As described in https://issues.apache.org/jira/browse/KAFKA-4526, the
> throttling test will fail if the producer in the produce-consume-validate
> loop starts up before the consumer is fully initialized.
> We need to block the start of the producer until the consumer is ready to go.
> The current plan is to poll the consumer for a particular metric (like, for
> instance, partition assignment) which will act as a good proxy for successful
> initialization. Currently, we just check for the existence of a process with
> the PID, which is not a strong enough check, causing the test to fail
> intermittently.