John Roesler created KAFKA-10260:
------------------------------------

             Summary: Streams could recover stores in a task independently
                 Key: KAFKA-10260
                 URL: https://issues.apache.org/jira/browse/KAFKA-10260
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: John Roesler

Currently, ProcessorStateManager checks for corrupted tasks as follows: for each persistent store, if its checkpoint is missing, then the task directory must also be empty; otherwise the task is considered corrupted. This is a little overzealous, since we don't check whether that store's specific directory is nonempty, only whether the task directory contains directories for any stores.

So if a task has two stores, one correctly written and checkpointed and the other neither written nor checkpointed, we _could_ correctly load the first and recover the second. Instead, we consider the whole task corrupted, discard the first store, and recover both.

The fix would be to check, for each persistent store that doesn't have a checkpoint, that its _specific_ store directory is also missing. Such a store will be restored from the changelog, and we don't need to consider the task corrupted.

See ProcessorStateManager#initializeStoreOffsetsFromCheckpoint
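
The proposed per-store check could be sketched roughly as below. This is only an illustration of the idea, not the actual ProcessorStateManager code: the class StoreRecoverySketch, the StoreAction enum, and the methods classify, classifyAll, and taskCorrupted are all hypothetical names. The sketch assumes the caller has already worked out which stores have a checkpoint entry and which stores have their own directory on disk, and it assumes that a store with data but no checkpoint still marks the task as corrupted, as the existing task-level check does.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Kafka Streams.
public class StoreRecoverySketch {

    // Possible outcomes for a single persistent store.
    enum StoreAction { LOAD_FROM_CHECKPOINT, RESTORE_FROM_CHANGELOG, CORRUPTED }

    // Decide what to do with one store.
    // checkpointed:   stores that have an offset in the checkpoint (assumed input)
    // storesWithData: stores whose own directory exists on disk (assumed input)
    static StoreAction classify(String store,
                                Set<String> checkpointed,
                                Set<String> storesWithData) {
        if (checkpointed.contains(store)) {
            // Checkpoint present: resume from the checkpointed offset.
            return StoreAction.LOAD_FROM_CHECKPOINT;
        }
        if (!storesWithData.contains(store)) {
            // No checkpoint and no store-specific directory:
            // nothing local to trust, so rebuild from the changelog.
            return StoreAction.RESTORE_FROM_CHANGELOG;
        }
        // Data on disk without a checkpoint: state cannot be trusted.
        return StoreAction.CORRUPTED;
    }

    // Classify every store in a task, preserving the given order.
    static Map<String, StoreAction> classifyAll(List<String> stores,
                                                Set<String> checkpointed,
                                                Set<String> storesWithData) {
        Map<String, StoreAction> result = new LinkedHashMap<>();
        for (String store : stores) {
            result.put(store, classify(store, checkpointed, storesWithData));
        }
        return result;
    }

    // The task is corrupted only if at least one store is.
    static boolean taskCorrupted(Map<String, StoreAction> actions) {
        return actions.containsValue(StoreAction.CORRUPTED);
    }

    public static void main(String[] args) {
        // The two-store scenario from the description: store-a is written
        // and checkpointed, store-b is neither written nor checkpointed.
        Map<String, StoreAction> actions = classifyAll(
            List.of("store-a", "store-b"),
            Set.of("store-a"),
            Set.of("store-a"));
        System.out.println(actions);
        System.out.println("task corrupted: " + taskCorrupted(actions));
    }
}
```

In the two-store scenario, the first store is loaded from its checkpoint and only the second is restored from the changelog, so the task as a whole is not flagged. Taking the two sets as inputs keeps the decision separate from how the checkpoint file and the store directories are actually read.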