John Roesler created KAFKA-10260:
------------------------------------

             Summary: Streams could recover stores in a task independently
                 Key: KAFKA-10260
                 URL: https://issues.apache.org/jira/browse/KAFKA-10260
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: John Roesler


Currently, ProcessorStateManager detects corrupted tasks with the following 
check: for each persistent store, if its checkpoint is missing, then the task 
directory must also be empty.

This is a little overzealous, since we aren't checking whether the store's 
specific directory is nonempty, only whether there are any directories for any 
stores. So if there are two stores in a task, and one is correctly written and 
checkpointed while the other is neither written nor checkpointed, we _could_ 
correctly load the first and recover the second. Instead, we consider the 
whole task corrupted, discard the first store, and recover both.

The fix would be to check, for each persistent store that doesn't have a 
checkpoint, that its _specific_ store directory is also missing. Such a store 
will be restored from the changelog and we don't need to consider the task 
corrupted.

See ProcessorStateManager#initializeStoreOffsetsFromCheckpoint
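
A rough sketch of the per-store check proposed above, only to illustrate the 
idea. It is not the actual ProcessorStateManager code. The class 
StoreRecoveryCheck, the Action enum, and the flat layout in which each store 
lives in <task dir>/<store name> are all assumptions made for this example. It 
also assumes that a store directory that exists but is empty can be treated 
the same as a missing one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Kafka Streams.
final class StoreRecoveryCheck {

    enum Action { LOAD_FROM_CHECKPOINT, RESTORE_FROM_CHANGELOG, CORRUPTED }

    // True if the store's own directory exists and contains at least one entry.
    // Assumption: a missing directory and an empty directory are equivalent.
    static boolean hasLocalState(Path storeDir) throws IOException {
        if (!Files.isDirectory(storeDir)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(storeDir)) {
            return entries.findAny().isPresent();
        }
    }

    // Decide per store, instead of per task, what to do on startup.
    // Assumption: each store lives directly under taskDir in a directory
    // named after the store.
    static Map<String, Action> classify(Path taskDir,
                                        List<String> persistentStores,
                                        Set<String> checkpointedStores) throws IOException {
        Map<String, Action> result = new LinkedHashMap<>();
        for (String store : persistentStores) {
            if (checkpointedStores.contains(store)) {
                // Checkpoint present: local state can be loaded as-is.
                result.put(store, Action.LOAD_FROM_CHECKPOINT);
            } else if (hasLocalState(taskDir.resolve(store))) {
                // Data on disk without a checkpoint: cannot be trusted.
                result.put(store, Action.CORRUPTED);
            } else {
                // No checkpoint and no data: rebuild from the changelog.
                result.put(store, Action.RESTORE_FROM_CHANGELOG);
            }
        }
        return result;
    }
}
```

With this shape, the task would only need to be considered corrupted if at 
least one store is classified as CORRUPTED. In the two-store case described 
above, the first store is loaded from its checkpoint and only the second is 
restored.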



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
