Re: org.apache.zookeeper.ClientCnxn, Client session timed out

Hao Sun Wed, 27 Dec 2017 15:12:06 -0800

Thanks! Great to know I do not have to worry duplicates inside Flink.

One more question, why this happens? Because TM and JM both check
leadership in different interval?
> The TM noticed the loss of leadership before the JM did.


On Wed, Dec 27, 2017, 13:52 Ufuk Celebi <u...@apache.org> wrote:

> On Wed, Dec 27, 2017 at 4:41 PM, Hao Sun <ha...@zendesk.com> wrote:
>
>> Somehow TM detected JM leadership loss from ZK and self disconnected?
>> And couple of seconds later, JM failed to connect to ZK?
>>
>
> Yes, exactly as you describe. The TM noticed the loss of leadership before
> the JM did.
>
>
>> After all the cluster recovered nicely by its own, but I am wondering
>> does this break the exactly-once semantics? If yes, what should I take care?
>>
>
> Great :-) It does not break exactly-once guarantees *within* the Flink
> pipeline as the state of the latest completed checkpoint will be restored
> after recovery. This rewinds your job and might result in duplicate or
> changed output if you don't use an exactly once or idempotent sink.
>
> – Ufuk
>
>

Re: org.apache.zookeeper.ClientCnxn, Client session timed out

Reply via email to