[ https://issues.apache.org/jira/browse/KAFKA-12475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302336#comment-17302336 ]

Bruno Cadonna commented on KAFKA-12475:
---------------------------------------

Another aspect we need to consider here is that with a local state store, 
there is one state store instance per stateful processor and task. With a 
remote custom state store, I guess users would prefer a single remote state 
store, or one remote state store per Streams client. That means the API needs 
to accept enough information to decide which part of the data in the state 
store should be deleted; essentially, that information is the task ID whose 
data needs to be deleted.
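As a minimal sketch of what such an API could look like, the shared "remote" store below namespaces its data by task ID so that a single task's state can be erased without touching the rest. The class name, the `deleteTaskData` method, and the string form of the task ID are all assumptions for illustration, not part of the Kafka Streams API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class SharedRemoteStore {
    // One shared "remote" store; data is namespaced by task ID so that a
    // single Streams task can be wiped without affecting the others.
    private final Map<String, Map<String, String>> dataByTask = new HashMap<>();

    public void put(String taskId, String key, String value) {
        dataByTask.computeIfAbsent(taskId, t -> new HashMap<>()).put(key, value);
    }

    public String get(String taskId, String key) {
        Map<String, String> taskData = dataByTask.get(taskId);
        return taskData == null ? null : taskData.get(key);
    }

    // The hypothetical hook Streams would call on a dirty close under EOS:
    // erase only the data belonging to the task being reset.
    public void deleteTaskData(String taskId) {
        dataByTask.remove(taskId);
    }

    public Set<String> tasks() {
        return dataByTask.keySet();
    }
}
```

With this shape, Streams could pass the task ID of the failed task (e.g. "0_0") and leave data owned by other tasks, and other Streams clients, intact.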

> Kafka Streams breaks EOS with remote state stores
> -------------------------------------------------
>
>                 Key: KAFKA-12475
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12475
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>              Labels: needs-kip
>
> Currently in Kafka Streams, exactly-once semantics (EOS) require that the 
> state stores be completely erased and restored from the changelog from 
> scratch in case of an error. This erasure is implemented by closing the state 
> store and then simply wiping out the local state directory. This works fine 
> for the two store implementations provided out of the box, in-memory and 
> RocksDB, but fails when the application includes a custom StateStore backed 
> by remote storage, such as Redis. In this case Streams will fail to erase any 
> of the data before reinserting data from the changelog, resulting in possible 
> duplicates and breaking the EOS guarantee.
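The failure mode described above can be sketched with plain maps standing in for the changelog and a remote store (the class and method names are illustrative, not Kafka code): a write from an aborted transaction sits in the remote store, the local directory wipe removes nothing remote, and the changelog replay leaves the dirty entry behind.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class EosWipeSimulation {
    // Simulates an EOS task reset against a remote store: wiping the local
    // state directory erases nothing remote, so an uncommitted ("dirty")
    // write survives the subsequent changelog restore.
    public static Map<String, String> restoreAfterFailure() {
        // Committed changelog: the source of truth for restoration.
        Map<String, String> changelog = new LinkedHashMap<>();
        changelog.put("a", "1");

        // A "remote" store (think Redis) holding the committed data plus one
        // write from a transaction that was aborted before commit.
        Map<String, String> remoteStore = new HashMap<>(changelog);
        remoteStore.put("b", "dirty"); // never made it into the changelog

        // Streams' wipe step deletes the local state directory only, which
        // is a no-op for a remote store: nothing is removed here.

        // Restoration replays the changelog; committed keys are rewritten,
        // but the dirty key "b" is left behind.
        remoteStore.putAll(changelog);
        return remoteStore;
    }
}
```

After "restoration" the store still contains the key "b" that was never committed to the changelog, so the store and the changelog disagree and the exactly-once guarantee is broken.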



--
This message was sent by Atlassian Jira
(v8.3.4#803005)