[ https://issues.apache.org/jira/browse/KAFKA-12475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303700#comment-17303700 ]
Guozhang Wang commented on KAFKA-12475:
---------------------------------------
+1 to [~cadonna]. I've seen some people use remote stores specifically to
consolidate state stores, or even to get around repartitioning. So I think
this is quite a common use case for remote stores. From the API perspective,
though, I think it can still be supported with a `deleteAll()`, as long as
users do some bookkeeping during `store.init(ProcessorContext..)` so that
they know that API calls from this instance belong to a particular task or
even app.
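To make the bookkeeping idea concrete, here is a minimal, self-contained sketch. It does not use any Kafka Streams API: `FakeRemoteStore`, `TaskScopedStore`, and the key-prefix scheme are all hypothetical, and `deleteAll()` is only the method proposed above, not an existing `StateStore` method. The assumption is that the store records the application id and task id at init time and namespaces every key with them, so that a later `deleteAll()` can erase only this task's data in a shared remote backend.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these types are Kafka Streams APIs.
class RemoteStoreSketch {

    // Stand-in for a remote service such as Redis, shared by many tasks and apps.
    static final class FakeRemoteStore {
        final Map<String, String> data = new ConcurrentHashMap<>();
    }

    // A store that remembers at init() time which app and task it serves.
    static final class TaskScopedStore {
        private final FakeRemoteStore remote;
        private String prefix; // bookkeeping recorded in init()

        TaskScopedStore(FakeRemoteStore remote) {
            this.remote = remote;
        }

        // Assumption: in a real custom store these values would be taken
        // from the context object passed to the store's init method.
        void init(String applicationId, String taskId) {
            // The trailing "/" keeps task "0_1" from matching task "0_10".
            this.prefix = applicationId + "/" + taskId + "/";
        }

        void put(String key, String value) {
            remote.data.put(prefix + key, value);
        }

        String get(String key) {
            return remote.data.get(prefix + key);
        }

        // Hypothetical deleteAll(): removes only the keys written under this
        // store's prefix and leaves other tasks' and apps' data untouched.
        void deleteAll() {
            remote.data.keySet().removeIf(k -> k.startsWith(prefix));
        }
    }

    public static void main(String[] args) {
        FakeRemoteStore remote = new FakeRemoteStore();

        TaskScopedStore task0 = new TaskScopedStore(remote);
        task0.init("my-app", "0_0");
        TaskScopedStore task1 = new TaskScopedStore(remote);
        task1.init("my-app", "0_1");

        task0.put("a", "1");
        task1.put("a", "2");

        // Simulate the wipe that would precede a restore from the changelog.
        task0.deleteAll();

        System.out.println(task0.get("a")); // prints "null"
        System.out.println(task1.get("a")); // prints "2"
    }
}
```

A prefix scan like this is linear in the size of the shared store; a real remote backend would more likely keep a per-task index or use a dedicated namespace, which is outside the scope of this sketch.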
> Kafka Streams breaks EOS with remote state stores
> -------------------------------------------------
>
> Key: KAFKA-12475
> URL: https://issues.apache.org/jira/browse/KAFKA-12475
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: A. Sophie Blee-Goldman
> Priority: Major
> Labels: needs-kip
>
> Currently in Kafka Streams, exactly-once semantics (EOS) require that the
> state stores be completely erased and restored from the changelog from
> scratch in case of an error. This erasure is implemented by closing the state
> store and then simply wiping out the local state directory. This works fine
> for the two store implementations provided out of the box, in-memory and
> RocksDB, but
> fails when the application includes a custom StateStore based on remote
> storage, such as Redis. In this case Streams will fail to erase any of the
> data before reinserting data from the changelog, resulting in possible
> duplicates and breaking the guarantee of EOS.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)