[
https://issues.apache.org/jira/browse/FLINK-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433305#comment-15433305
]
Till Rohrmann commented on FLINK-4449:
--------------------------------------
Can we create a generic {{HeartbeatManager}} that can be used for the
heartbeats between RM <=> TM, RM <=> JM and JM <=> TM? I think this should be
possible, similar to the {{RetryingRegistration}}. I think we should create a
dedicated issue for the implementation, where we should also flesh out the
details of the implementation a bit more. For example, should the heartbeat
response be delivered as the result of a future, or should the sending side
also be an RPC endpoint that is told about the heartbeat response via a tell
operation?
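
To make the two delivery options concrete, here is a minimal sketch in plain Java. All names in it ({{FutureTarget}}, {{TellTarget}}, {{HeartbeatSender}}, {{RecordingSender}}) are hypothetical and only illustrate the question; they are not existing Flink interfaces, and the string payload is a placeholder for whatever the heartbeat would actually carry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: none of these types exist in Flink.
final class HeartbeatDeliverySketch {

    /** Option 1: the target answers through a future handed back to the sender. */
    interface FutureTarget {
        CompletableFuture<String> requestHeartbeat(String senderId);
    }

    /** Option 2: the sender is itself an endpoint that can be told the response. */
    interface HeartbeatSender {
        void heartbeatResponse(String targetId, String payload);
    }

    /** Option 2: the target returns nothing and calls back on the sender instead. */
    interface TellTarget {
        void requestHeartbeat(HeartbeatSender sender);
    }

    /** A sender that simply records every response it is told about. */
    static final class RecordingSender implements HeartbeatSender {
        final List<String> received = new ArrayList<>();

        @Override
        public void heartbeatResponse(String targetId, String payload) {
            received.add(targetId + ":" + payload);
        }
    }

    public static void main(String[] args) {
        // Future style: the response is the result of the returned future.
        FutureTarget futureTarget =
                senderId -> CompletableFuture.completedFuture("slots=2");
        System.out.println(futureTarget.requestHeartbeat("rm").join());

        // Tell style: the response arrives as a separate one-way message.
        RecordingSender sender = new RecordingSender();
        TellTarget tellTarget = s -> s.heartbeatResponse("tm-1", "slots=2");
        tellTarget.requestHeartbeat(sender);
        System.out.println(sender.received);
    }
}
```

With the future style the sender keeps the request and the response together in one call, while the tell style requires the sender to expose a receiving method and to correlate responses with targets itself.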
> Heartbeat Manager between ResourceManager and TaskExecutor
> ----------------------------------------------------------
>
> Key: FLINK-4449
> URL: https://issues.apache.org/jira/browse/FLINK-4449
> Project: Flink
> Issue Type: Sub-task
> Components: Cluster Management
> Reporter: zhangjing
> Assignee: zhangjing
>
> HeartbeatManager is responsible for the heartbeats between the ResourceManager
> and the TaskExecutors.
> 1. Register TaskExecutors
> Register heartbeat targets. If the heartbeat response for a target is
> not reported in time, mark the target as failed and notify the ResourceManager.
> 2. Trigger heartbeats
> Periodically trigger a heartbeat from the ResourceManager to each TaskExecutor.
> Each TaskExecutor reports its slot allocation in the heartbeat response.
> The ResourceManager syncs its own slot allocation with the heartbeat response.
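
The two responsibilities in the description above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: the class {{HeartbeatMonitorSketch}}, its {{Listener}} callback and the integer slot count are all hypothetical, and time is passed in explicitly so that the behaviour is deterministic instead of depending on a scheduler.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: not a Flink class.
final class HeartbeatMonitorSketch {

    /** Callback used to notify the owner (e.g. the ResourceManager) of a failed target. */
    interface Listener {
        void targetFailed(String targetId);
    }

    private final long timeoutMillis;
    private final Listener listener;
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final Map<String, Integer> slots = new HashMap<>();

    HeartbeatMonitorSketch(long timeoutMillis, Listener listener) {
        this.timeoutMillis = timeoutMillis;
        this.listener = listener;
    }

    /** Step 1: register a heartbeat target. */
    void register(String targetId, long nowMillis) {
        lastSeen.put(targetId, nowMillis);
    }

    /** Step 2: record a heartbeat response together with the reported slot allocation. */
    void heartbeatResponse(String targetId, int allocatedSlots, long nowMillis) {
        if (lastSeen.containsKey(targetId)) { // ignore unknown or already failed targets
            lastSeen.put(targetId, nowMillis);
            slots.put(targetId, allocatedSlots);
        }
    }

    /** Marks every target whose last response is older than the timeout as failed. */
    void checkTimeouts(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                String targetId = entry.getKey();
                it.remove();
                slots.remove(targetId);
                listener.targetFailed(targetId);
            }
        }
    }

    /** Last slot allocation reported by the target, or 0 if none is known. */
    int allocatedSlots(String targetId) {
        return slots.getOrDefault(targetId, 0);
    }

    public static void main(String[] args) {
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch(
                100, id -> System.out.println("failed: " + id));
        monitor.register("tm-1", 0);
        monitor.register("tm-2", 0);
        monitor.heartbeatResponse("tm-1", 3, 80);
        monitor.checkTimeouts(150); // only tm-2 has been silent for longer than 100 ms
    }
}
```

In a real setup the periodic trigger and the timeout check would presumably be driven by a scheduled executor or the RPC endpoint's own scheduling rather than by explicit timestamps.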
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)