[ 
https://issues.apache.org/jira/browse/KAFKA-12475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303700#comment-17303700
 ] 

Guozhang Wang commented on KAFKA-12475:
---------------------------------------

+1 to [~cadonna]. I've seen some people using remote stores specifically for 
purposes like consolidating state stores, or even to get around 
repartitioning. So I think this is quite a common theme for remote stores. 
From the API perspective, though, I think it can still be supported with a 
`deleteAll()` as long as users have done some bookkeeping during 
`store.init(ProcessorContext..)` so that they know which API calls from this 
instance are related to a certain task or even app.
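
A minimal sketch of the bookkeeping idea, under stated assumptions: the 
class names `FakeRemoteKv` and `NamespacedRemoteStore` are hypothetical, an 
in-memory map stands in for the remote service, and `deleteAll()` is the 
proposed hook rather than an existing Kafka Streams method. The store records 
an app/task prefix during `init` and later erases only the keys under that 
prefix.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a remote key-value service such as Redis,
// shared by every store instance of the application.
class FakeRemoteKv {
    final Map<String, String> data = new ConcurrentHashMap<>();
}

// Hypothetical remote-backed store; not part of the Kafka Streams API.
class NamespacedRemoteStore {
    private final FakeRemoteKv remote;
    private String prefix; // recorded during init(); identifies this task's keys

    NamespacedRemoteStore(FakeRemoteKv remote) {
        this.remote = remote;
    }

    // Bookkeeping step. In a real StateStore these values would be read from
    // the context passed to init(); here they are plain arguments (assumption).
    void init(String applicationId, String taskId) {
        // The trailing separator keeps task "0_1" from matching task "0_10".
        this.prefix = applicationId + "/" + taskId + "/";
    }

    void put(String key, String value) {
        remote.data.put(prefix + key, value);
    }

    String get(String key) {
        return remote.data.get(prefix + key);
    }

    // Proposed hook: erase only the entries written through this instance,
    // leaving other tasks' data in the shared remote store untouched.
    void deleteAll() {
        remote.data.keySet().removeIf(k -> k.startsWith(prefix));
    }
}
```

With two instances sharing one `FakeRemoteKv`, calling `deleteAll()` on the 
instance initialized for one task removes that task's entries only, so the 
store can be restored from the changelog without touching consolidated data 
that belongs to other tasks.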

> Kafka Streams breaks EOS with remote state stores
> -------------------------------------------------
>
>                 Key: KAFKA-12475
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12475
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>              Labels: needs-kip
>
> Currently in Kafka Streams, exactly-once semantics (EOS) require that the 
> state stores be completely erased and restored from the changelog from 
> scratch in case of an error. This erasure is implemented by closing the state 
> store and then simply wiping out the local state directory. This works fine 
> for the two store implementations provided OOTB, in-memory and rocksdb, but 
> fails when the application includes a custom StateStore based on remote 
> storage, such as Redis. In this case Streams will fail to erase any of the 
> data before reinserting data from the changelog, resulting in possible 
> duplicates and breaking the guarantee of EOS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
