Hi Lu,
Longer heartbeat timeouts mean that the loss of a component
(e.g. a TaskManager) will take longer to be detected. This slows down the
recovery of your application in such a situation. On the
upside, longer heartbeat timeouts allow working on less reliable
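
To make the detection-delay trade-off concrete, here is a rough sketch. It assumes a simplified model (not Flink's actual implementation) in which the monitoring side resets a timer on every received heartbeat and declares the peer lost once the timeout elapses without one. The function name `detection_window_ms` is hypothetical; the two parameters correspond to the `heartbeat.interval` and `heartbeat.timeout` settings, and the first call uses what I believe are the documented defaults (10 s and 50 s).

```python
from typing import Tuple


def detection_window_ms(interval_ms: int, timeout_ms: int) -> Tuple[int, int]:
    """Return (best_case, worst_case) delay in ms between a crash and its detection.

    Simplified model, stated as an assumption: the timer restarts whenever a
    heartbeat arrives, and the peer is declared lost after timeout_ms of silence.
    """
    if interval_ms <= 0 or timeout_ms <= interval_ms:
        raise ValueError("expected 0 < interval_ms < timeout_ms")
    # Crash just before the next heartbeat was due: almost one full interval
    # of the timeout has already passed.
    best_case = timeout_ms - interval_ms
    # Crash right after a heartbeat was delivered: the full timeout must elapse.
    worst_case = timeout_ms
    return best_case, worst_case


# heartbeat.interval = 10000, heartbeat.timeout = 50000 (assumed defaults)
print(detection_window_ms(10_000, 50_000))
# Same interval with a timeout raised to 3 minutes (illustrative value)
print(detection_window_ms(10_000, 180_000))
```

Under this model, raising the timeout shifts the whole detection window by the same amount, which is the recovery-speed cost described above.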
Hi Lu,
Xintong gave a thorough analysis of TM heartbeat timeouts in an
earlier mail [1]. Please check whether it helps.
[1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Heartbeat-of-TaskManager-timed-out-td36228.html
Best regards,
JING ZHANG
On Fri, Jun 11, 2021, Lu Niu wrote:
Hi Flink users,
Several of our applications occasionally hit heartbeat timeouts. There is no
GC issue and no OOM:
```
- realtime conversion event filter (49/120)
(16e3d3e7608eed0a30e0508eb52065fd) switched from RUNNING to FAILED on
container_e05_1599158866703_129001_01_000111 @