[ https://issues.apache.org/jira/browse/FLINK-4806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568215#comment-15568215 ]
Maximilian Michels commented on FLINK-4806:
-------------------------------------------

Thank you for your comments [~ykt836]. I agree that it would be nice to simplify this part of the ResourceManager. I think what is really important is that we keep an up-to-date view of the leadership information. Otherwise, stale JobMasters could send requests to the ResourceManager that cause it to take unnecessary actions. The approach you suggested would eventually pick up the new leader, but it would let old leaders control the ResourceManager for as long as the new one has not connected. I still have to think about whether that could actually be a problem or whether eventual consistency would be enough.

> ResourceManager stop listening JobManager's leader address
> ----------------------------------------------------------
>
>                 Key: FLINK-4806
>                 URL: https://issues.apache.org/jira/browse/FLINK-4806
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Kurt Young
>
> Currently, in the flip-6 branch, when the RM receives a registration from a JM,
> it verifies the leader session id of the JM and attaches a
> JobManagerLeaderListener to it to monitor future changes.
> Maybe we can simplify it a little bit. We don't monitor the leadership change
> of the JM; once verification has passed when the JM registers itself, we simply
> record the leader id of the registered JM for future RPC filtering and start a
> heartbeat monitor with the JM.
> If the JM's leadership has changed, the new JM will register itself, the RM
> will verify its leadership when it receives the registration, and the RM can
> decide whether to accept or reject the registration. In effect, the JM's
> information in the RM is preempted only by a new JM, not by the RM itself
> through a leadership change listener. By doing this, we can simplify the logic
> inside the RM and don't have to do any error handling for the leader listener.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
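
As a rough illustration of the simplified scheme proposed in the issue description, the following Java sketch records the leader id at registration time and uses it to filter later RPCs. All names here (`JobLeaderIdRegistry`, `registerJobMaster`, `acceptsRpc`, the `BiPredicate` verifier) are hypothetical and are not Flink's actual ResourceManager API; the verifier merely stands in for whatever leadership check the RM performs on registration, and heartbeat monitoring is left out.

```java
import java.util.Map;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

/** Hypothetical sketch of the proposal; not Flink's actual ResourceManager code. */
final class JobLeaderIdRegistry {

    // jobId -> leader id recorded at the last accepted registration
    private final Map<String, UUID> leaderIds = new ConcurrentHashMap<>();

    // Assumed stand-in for the leadership verification done at registration time.
    private final BiPredicate<String, UUID> leadershipVerifier;

    JobLeaderIdRegistry(BiPredicate<String, UUID> leadershipVerifier) {
        this.leadershipVerifier = Objects.requireNonNull(leadershipVerifier);
    }

    /**
     * Verifies the leader id once, at registration. On success the recorded
     * leader id for the job is replaced, so an older JM is preempted only by
     * a newly registered JM, never by a leadership listener inside the RM.
     */
    boolean registerJobMaster(String jobId, UUID leaderId) {
        if (!leadershipVerifier.test(jobId, leaderId)) {
            return false; // registration rejected
        }
        leaderIds.put(jobId, leaderId);
        return true;
    }

    /** Filters a later RPC by comparing its leader id with the recorded one. */
    boolean acceptsRpc(String jobId, UUID leaderId) {
        return leaderId != null && leaderId.equals(leaderIds.get(jobId));
    }
}
```

In this sketch a stale leader keeps passing `acceptsRpc` until the new JobMaster has registered, which is the window of eventual consistency raised in the comment.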