[ 
https://issues.apache.org/jira/browse/KAFKA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567529#comment-16567529
 ] 

Matthias J. Sax commented on KAFKA-4479:
----------------------------------------

Just re-reading the description: Streams does not send a "leave group request" 
in close() – this is an internal setting to allow for rolling bounce upgrades 
without state migration. Seems to be related. However, I haven't seen any 
failure of `QueryableStateIntegrationTest` for quite some time. Maybe we can 
close this a cannot reproduce?

> Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-4479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4479
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams, unit tests
>            Reporter: Ismael Juma
>            Priority: Minor
>              Labels: newbie, newbie++
>         Attachments: test.zip
>
>
> If we pass `KafkaServer.time` to `GroupCoordinator`[1], some streams tests 
> like QueryableStateIntegrationTest fail sem-regularly. [~damianguy] looked 
> into it and described it as:
> {quote}
> Looking at the sequence of events, one thread is stopped, and hence leaves 
> the group triggering a rebalance, but the other thread doesn’t seem to get 
> the memo, tries to commit, fails, and then game-over.
> So.. the case that it fails the one alive thread is not getting a rebalance. 
> This would happen during  a `poll(..)` right? However i can see the thread is 
> polling many times after the other thread has shutdown.
> It tries to commit every time around the loop, so:
> poll(..)
> process(..)
> maybeCommit(..)
> and there is like < 10ms between calls to `poll`.
> {quote}
> A theory was that the mock time was not advancing enough to trigger a 
> rebalance in the group coordinator. However, the consumer is closed, so that 
> should trigger a `LeaveGroup` request and it's unclear why a rebalance is not 
> triggered for the live consumer.
> PR where this issue was first seen and discussed: 
> https://github.com/apache/kafka/pull/2095
> [1] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to