Re: How to achieve exactly once on node failure using Kafka

2017-02-24 Thread Y. Sakamoto
Hi Robert, that makes it clear to me. Thanks! Regards, Yuichiro. On 2017/02/24 1:08, Robert Metzger wrote: Hi, exactly. You have to make sure that you can write data for the same ID multiple times. Exactly-once in Flink is only guaranteed for registered state. So if you have a flatMap() with a …

Re: How to achieve exactly once on node failure using Kafka

2017-02-23 Thread Robert Metzger
Hi, exactly. You have to make sure that you can write data for the same ID multiple times. Exactly-once in Flink is only guaranteed for registered state. So if you have a flatMap() with a "counter" variable that is held in a "ValueState", this counter will always be in sync with the number of …
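Robert's point about registered state can be illustrated with a minimal simulation. This is plain Scala, not the Flink API: the variables and the `process` helper are hypothetical stand-ins for Flink's checkpointing. On failure, state registered in the checkpoint rolls back and is replayed consistently, while external side effects are not rolled back and therefore duplicate.

```scala
// Sketch (plain-Scala stand-in, NOT Flink's API) of why a counter held in
// checkpointed state stays exactly-once after a failure, while unmanaged
// external writes are only at-least-once.
import scala.collection.mutable.ArrayBuffer

var counter = 0                       // "ValueState": part of the checkpoint
var checkpointedCounter = 0           // last completed checkpoint of that state
var checkpointedOffset = 0            // source offset stored in the same checkpoint
val sideEffects = ArrayBuffer[Int]()  // external writes: NOT checkpointed

val records = (1 to 6).toVector

def process(from: Int, failAt: Option[Int]): Unit = {
  var i = from
  while (i < records.length && !failAt.contains(i)) {
    counter += 1
    sideEffects += records(i)         // e.g. a write to an external system
    if ((i + 1) % 2 == 0) {           // checkpoint every 2 records
      checkpointedCounter = counter
      checkpointedOffset = i + 1
    }
    i += 1
  }
}

process(0, failAt = Some(3))          // crash after 3 records; last checkpoint covered 2
counter = checkpointedCounter         // recovery: state rolls back to the checkpoint
process(checkpointedOffset, None)     // replay from the checkpointed offset

println(s"counter=$counter")          // 6: in sync with the 6 input records
println(s"writes=${sideEffects.size}")// 7: record 3 was written twice
```

The counter ends at 6 because it is rolled back and replayed together with the source offset; the external write log contains a duplicate because nothing rolled it back.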

Re: How to achieve exactly once on node failure using Kafka

2017-02-21 Thread Y. Sakamoto
Thank you for your reply. My understanding is that Map/Filter functions operate with "at least once" semantics when a failure occurs, so the sink must be coded so that a record is saved (overwritten) in Elasticsearch under the same ID even if duplicate data arrives. Is that correct? (Sorry, I cannot understand …
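The "overwrite under the same ID" idea can be sketched as follows. A mutable `Map` stands in for an Elasticsearch index here (an assumption for illustration; a real index-by-ID request has the same overwrite semantics): replaying a record after a failure rewrites the existing document instead of creating a second one.

```scala
// Sketch: a Map stands in for an Elasticsearch index keyed by document ID.
// Writing the same ID twice is idempotent, so at-least-once replay is safe.
import scala.collection.mutable

val index = mutable.Map[String, String]()    // docId -> document body

def upsert(id: String, doc: String): Unit =
  index(id) = doc                            // overwrite-by-ID semantics

upsert("order-42", """{"status":"shipped"}""")
upsert("order-42", """{"status":"shipped"}""") // duplicate from replay: harmless

println(index.size)                          // 1 document, despite two writes
```

This is why deterministic document IDs (e.g. derived from the record key) matter: with auto-generated IDs, each replayed write would create a new document.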

Re: How to achieve exactly once on node failure using Kafka

2017-02-20 Thread Stephan Ewen
Hi! Exactly-once end-to-end requires sinks that support that kind of behavior (typically some form of transactions support). Kafka currently does not have the mechanisms in place to support exactly-once sinks, but the Kafka project is working on that feature. For Elasticsearch, it is also not …
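One common shape for such a transactional sink is to buffer writes between checkpoints and make them visible only when a checkpoint completes. The class below is a simplified stand-in, not Flink's actual sink API (though the commit-on-checkpoint callback loosely mirrors Flink's `CheckpointListener`); all names here are hypothetical.

```scala
// Sketch (simplified stand-in, not Flink's sink API) of a buffering sink:
// writes are staged between checkpoints and "committed" (made visible) only
// when a checkpoint completes. A failure between checkpoints discards the
// uncommitted buffer, so the replay does not produce duplicates downstream.
import scala.collection.mutable.ArrayBuffer

class BufferingSink[T] {
  private val pending = ArrayBuffer[T]()  // staged since the last checkpoint
  val committed = ArrayBuffer[T]()        // visible to downstream consumers

  def invoke(value: T): Unit = pending += value
  def notifyCheckpointComplete(): Unit = { committed ++= pending; pending.clear() }
  def failAndRecover(): Unit = pending.clear() // uncommitted writes are lost, then replayed
}

val sink = new BufferingSink[Int]
sink.invoke(1); sink.invoke(2)
sink.notifyCheckpointComplete()  // 1 and 2 become visible
sink.invoke(3)
sink.failAndRecover()            // 3 was never exposed; the source replays it
sink.invoke(3)
sink.notifyCheckpointComplete()

println(sink.committed.mkString(","))  // 1,2,3 with no duplicate
```

The cost of this approach is latency: output only becomes visible at checkpoint boundaries, which is the usual trade-off against idempotent per-record writes.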

How to achieve exactly once on node failure using Kafka

2017-02-20 Thread Y. Sakamoto
Hi, I'm using Flink 1.2.0 and trying to achieve "exactly once" data transfer from Kafka to Elasticsearch, but I cannot. (Scala 2.11, Kafka 0.10, without YARN.) There are two Flink TaskManager nodes, and while processing with parallelism 2, I shut one of them down (simulating node failure). Using …
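For a setup like the one described, recovery from a killed TaskManager depends on checkpointing being enabled. The fragment below sketches the usual wiring for a Flink 1.2 Scala job; it is a configuration sketch, not a standalone program (it needs the flink-streaming-scala and flink-connector-kafka-0.10 dependencies, and the topic name and Kafka properties are placeholders).

```scala
import java.util.Properties
import org.apache.flink.streaming.api.CheckpointingMode
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.enableCheckpointing(5000) // checkpoint every 5 s
env.getCheckpointConfig.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE)

val props = new Properties()
props.setProperty("bootstrap.servers", "localhost:9092") // placeholder
props.setProperty("group.id", "flink-consumer")          // placeholder

// On recovery, the consumer rewinds to the offsets stored in the last
// completed checkpoint, so records after that point are replayed.
val stream = env.addSource(
  new FlinkKafkaConsumer010[String]("my-topic", new SimpleStringSchema(), props))
```

Note that, as the replies above explain, this makes the Kafka source and registered operator state exactly-once, but records replayed after the last checkpoint still reach the Elasticsearch sink a second time unless the writes are idempotent.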