[ 
https://issues.apache.org/jira/browse/KAFKA-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164009#comment-17164009
 ] 

John Roesler commented on KAFKA-10287:
--------------------------------------

From [~cadonna]'s PR:

> In PR [#8962|https://github.com/apache/kafka/pull/8962] we introduced a 
> sentinel UNKNOWN_OFFSET to mark unknown offsets in checkpoint files. The 
> sentinel was set to -2 which is the same value used for the sentinel 
> LATEST_OFFSET that is used in subscriptions to signal that state stores have 
> been used by an active task. Unfortunately, we failed to skip UNKNOWN_OFFSET 
> when computing the sum of the changelog offsets.

> If a task had only one state store and it did not restore anything before the 
> next rebalance, the stream thread wrote -2 (i.e., UNKNOWN_OFFSET) into the 
> subscription as the sum of the changelog offsets. During assignment, the 
> leader interpreted the -2 as if the stream thread had run the task as active, 
> although it might have run it as standby. This misinterpretation of the 
> sentinel value resulted in unexpected task assignments.

This seems like a blocker to me, [~rhauch], what do you think? The impact is 
that the streams assignor would unexpectedly move work to a node that's not 
ready, resulting in downtime while that node restores from the changelog.
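
To make the collision concrete, here is a minimal sketch of the problem as the 
PR text describes it. It is not the actual Kafka Streams code: the class name, 
method names, and the changelog name are hypothetical, and the "fixed" variant 
only assumes that the fix is to skip UNKNOWN_OFFSET when summing.

```java
import java.util.Collections;
import java.util.Map;

public class OffsetSumSketch {
    // Both sentinels share the value -2, as described in the PR text.
    static final long UNKNOWN_OFFSET = -2L; // marks an unknown offset in a checkpoint file
    static final long LATEST_OFFSET = -2L;  // signals "used by an active task" in a subscription

    // Buggy variant (hypothetical): adds every checkpointed offset, sentinel included.
    static long buggySum(Map<String, Long> checkpointedOffsets) {
        long sum = 0L;
        for (long offset : checkpointedOffsets.values()) {
            sum += offset;
        }
        return sum;
    }

    // Assumed fix (hypothetical): skip unknown offsets so they cannot leak into the sum.
    static long fixedSum(Map<String, Long> checkpointedOffsets) {
        long sum = 0L;
        for (long offset : checkpointedOffsets.values()) {
            if (offset != UNKNOWN_OFFSET) {
                sum += offset;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // A task with a single state store that has not restored anything yet.
        Map<String, Long> standbyTask =
            Collections.singletonMap("store-changelog-0", UNKNOWN_OFFSET);

        // The buggy sum is indistinguishable from LATEST_OFFSET.
        System.out.println(buggySum(standbyTask) == LATEST_OFFSET);
        // With the sentinel skipped, the sum no longer looks like an active task.
        System.out.println(fixedSum(standbyTask) == LATEST_OFFSET);
    }
}
```

With one store and nothing restored, the buggy sum is exactly -2, which the 
leader cannot tell apart from LATEST_OFFSET. That matches the failure mode 
described above.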

> fix flaky streams/streams_standby_replica_test.py
> -------------------------------------------------
>
>                 Key: KAFKA-10287
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10287
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams, system tests
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>             Fix For: 2.6.0
>
>
> {quote}
> Module: kafkatest.tests.streams.streams_standby_replica_test
> Class:  StreamsStandbyTask
> Method: test_standby_tasks_rebalance
> {quote}
> It passes occasionally on my local machine.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
