[ https://issues.apache.org/jira/browse/KAFKA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912529#comment-16912529 ]
Chris Pettitt commented on KAFKA-8755:
--------------------------------------

Quick update (for my own sanity):
# The Kafka Streams team was a huge help yesterday when we discussed expected behavior and some ideas on how to achieve it.
# KAFKA-8816 is definitely required for this, because we end up checkpointing the same offset over and over.
# I made the proposed change of updating the offset limit for standby tasks when commit is called, which does get the standby task moving.
## BUG 1: Unfortunately we appear to need to update the offset limit for the active task as well, or we get "WARN Detected out-of-order KTable update for source-table at offset 0, partition 0.". It is sufficient to update the offset limit once for the active task.
## BUG 2: We seem to call update offset limits in a pretty tight loop during failure / re-assignment.
## BUG 3: We're updating the changelog offset on the standby task just fine until we hit the active task failure, then we immediately write a checkpoint for offset 0 :). So we're still back to square 1 at the moment.
## BUG 4 (?): Maybe an issue: during restore we claim we're going to restore to the end offset, which is ahead of the committed offset. Somewhere in the call chain this gets fixed, but the logging is still misleading.
## Cleanup: We have a few tests relying on updateOffsetLimit to trigger a side effect that throws an exception. Need to rework those tests to trigger the exception in another way.
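The idea in item 3 can be sketched in plain Java. This is a hypothetical model, not Kafka's actual `StandbyTask` code: the class names, fields, and methods (`StandbySketch`, `restore`, `onCommit`, `offsetLimit`) are invented for illustration. It shows why the standby stalls without the fix: records from the source-as-changelog topic may only be applied up to the committed offset limit, so if the limit is never advanced on commit, the buffered records are never written to the store.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a standby task for an optimized source table.
// Not Kafka's real classes; a sketch of the offset-limit mechanics only.
class StandbySketch {
    static final class Record {
        final long offset; final String key; final String value;
        Record(long offset, String key, String value) {
            this.offset = offset; this.key = key; this.value = value;
        }
    }

    private final Map<String, String> store = new HashMap<>();
    private final List<Record> buffered = new ArrayList<>();
    private long offsetLimit = 0L;       // exclusive: apply offsets < limit
    private long appliedOffset = -1L;    // highest offset written to the store

    // Records read from the input topic are applied only up to the offset
    // limit; the rest stay buffered, since the standby must not get ahead
    // of what the active task has committed.
    void restore(List<Record> records) {
        buffered.addAll(records);
        applyUpToLimit();
    }

    // Proposed change: when the active task commits, propagate the newly
    // committed offset to the standby as its offset limit, which unblocks
    // the buffered records.
    void onCommit(long committedOffset) {
        offsetLimit = committedOffset;
        applyUpToLimit();
    }

    private void applyUpToLimit() {
        buffered.removeIf(r -> {
            if (r.offset < offsetLimit) {
                store.put(r.key, r.value);
                appliedOffset = r.offset;
                return true;
            }
            return false;
        });
    }

    long appliedOffset() { return appliedOffset; }
    Map<String, String> store() { return store; }
}
{code}

With no commit, nothing reaches the store even after restoring records; after `onCommit(2)`, offsets 0 and 1 are applied. That mirrors the bug report: without the offset-limit update on commit, the standby reads the input topic but never writes its state store.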
> Stand-by Task of an Optimized Source Table Does Not Write Anything to its
> State Store
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8755
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8755
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.4.0
>            Reporter: Bruno Cadonna
>            Assignee: Chris Pettitt
>            Priority: Major
>              Labels: newbie
>         Attachments: StandbyTaskTest.java
>
>
> With the following topology:
> {code:java}
> builder.table(
>     INPUT_TOPIC,
>     Consumed.with(Serdes.Integer(), Serdes.Integer()),
>     Materialized.<Integer, Integer, KeyValueStore<Bytes, byte[]>>as(stateName)
> )
> {code}
> and with topology optimization turned on, Kafka Streams uses the input topic
> {{INPUT_TOPIC}} as the changelog topic for the state store {{stateName}}. A
> stand-by task for such a topology should read from {{INPUT_TOPIC}} and write
> the records to its state store, so that the streams client that runs the
> stand-by task can take over execution of the topology with an up-to-date
> replica of the state in case of a failure.
>
> Currently, the stand-by task described above reads from the input topic but
> does not write the records to its state store. Thus, after a failure the
> stand-by task cannot provide an up-to-date state store and the streams
> client needs to construct the state from scratch before it can take over
> execution.
>
> The described behaviour can be reproduced with the attached test.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)