[ https://issues.apache.org/jira/browse/FLINK-14112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933022#comment-16933022 ]
TisonKun commented on FLINK-14112:
----------------------------------

I agree with [~trohrmann]'s comments.

Another question: why do we notify listeners with a "null" address/session-id at all? I think the timeout logic can be handled by heartbeats. If we enforce that a notification always contains valid leader info, we can reduce noisy, meaningless logging and also simplify the logic in {{LeaderRetrievalListener}}.

> Removing zookeeper state should cause the task manager and job managers to
> restart
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-14112
>                 URL: https://issues.apache.org/jira/browse/FLINK-14112
>             Project: Flink
>          Issue Type: Wish
>          Components: Runtime / Coordination
>    Affects Versions: 1.8.1, 1.9.0
>            Reporter: Aaron Levin
>            Priority: Minor
>
> Suppose you have a flink application running on a cluster with the following
> configuration:
> {noformat}
> high-availability.zookeeper.path.root: /flink
> {noformat}
> Now suppose you delete all the znodes within {{/flink}}. I experienced the
> following:
> * massive amount of logging
> * application did not restart
> * task manager did not crash or restart
> * job manager did not crash or restart
> From this state I had to restart all the task managers and all the job
> managers in order for the flink application to recover.
> It would be desirable for the Task Managers and Job Managers to crash if the
> znode is not available (though perhaps you all have thought about this more
> deeply than I!)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)