Till Rohrmann created FLINK-9417:
------------------------------------
Summary: Send heartbeat requests from RPC endpoint's main thread
Key: FLINK-9417
URL: https://issues.apache.org/jira/browse/FLINK-9417
Project: Flink
Issue Type: Improvement
Components: Distributed Coordination
Affects Versions: 1.5.0, 1.6.0
Reporter: Till Rohrmann
Currently, we use the {{RpcService#scheduledExecutor}} to send heartbeat
requests to remote targets. The problem is that this endpoint keeps sending
heartbeat requests even if its main thread is currently blocked. While the
main thread is blocked, the heartbeat responses cannot be processed, so the
remote target times out. The remote side does not notice the problem because
it still receives the heartbeat requests.
A solution to this problem would be to send the heartbeat requests to the
remote target through the RPC endpoint's main thread. That way, the heartbeat
requests would also be held back while the main thread is blocked or busy.
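
The following is a minimal sketch of the idea, not Flink's actual
implementation. It uses plain java.util.concurrent executors as stand-ins: a
single-threaded executor plays the role of the RPC endpoint's main thread and
a scheduled executor plays the role of {{RpcService#scheduledExecutor}}. The
names HeartbeatSketch and heartbeatsSentWhileBlocked are hypothetical and
exist only for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {

    /**
     * Blocks a simulated main thread, schedules heartbeat requests every 10 ms
     * for roughly 200 ms, and returns how many requests were "sent" in that time.
     *
     * @param viaMainThread if true, each request is handed to the main thread
     *                      (the proposed behaviour); if false, the timer thread
     *                      sends it directly (the behaviour described above)
     */
    static int heartbeatsSentWhileBlocked(boolean viaMainThread) {
        // Stand-in for the RPC endpoint's main thread (assumption: one thread, FIFO queue).
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        // Stand-in for the scheduled executor that triggers heartbeats.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger sent = new AtomicInteger();
        CountDownLatch unblock = new CountDownLatch(1);
        try {
            // Simulate a blocked main thread: this task occupies it until released.
            mainThread.execute(() -> {
                try {
                    unblock.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // "Sending" a heartbeat request is modelled as incrementing a counter.
            Runnable sendRequest = sent::incrementAndGet;
            Runnable tick;
            if (viaMainThread) {
                // Proposed: the timer only enqueues the send on the main thread.
                tick = () -> mainThread.execute(sendRequest);
            } else {
                // Current: the timer thread sends the request itself.
                tick = sendRequest;
            }
            timer.scheduleAtFixedRate(tick, 0L, 10L, TimeUnit.MILLISECONDS);

            try {
                Thread.sleep(200L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Read the counter while the main thread is still blocked.
            return sent.get();
        } finally {
            timer.shutdownNow();
            unblock.countDown();
            mainThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("sent directly from timer:  " + heartbeatsSentWhileBlocked(false));
        System.out.println("sent through main thread:  " + heartbeatsSentWhileBlocked(true));
    }
}
```

In this sketch, requests routed through the main thread queue up behind the
blocking task and are not sent until it finishes, so the remote side would
stop seeing heartbeats and could detect the stalled endpoint. Requests sent
directly from the timer thread keep going out regardless.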
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)