Xintong Song created FLINK-13184:
------------------------------------

             Summary: Support launching task executors with multi-thread on YARN.
                 Key: FLINK-13184
                 URL: https://issues.apache.org/jira/browse/FLINK-13184
             Project: Flink
          Issue Type: Bug
          Components: Deployment / YARN
    Affects Versions: 1.8.1, 1.9.0
            Reporter: Xintong Song
            Assignee: Xintong Song

Currently, YarnResourceManager (RM) starts all task executors (TEs) in its main thread. When a large number of TEs are launched (e.g. > 1000), this can make the RM main thread unresponsive, leading to TE registration and heartbeat timeouts.

In Blink, the RM uses a thread pool to start TEs through the YARN NMClient in separate threads. I think we should add this feature to the Flink master branch as well.
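
To illustrate the idea, here is a minimal sketch of handing blocking container launches to a thread pool so that the calling thread returns immediately. It is not the Blink or Flink implementation. `ContainerStarter` and `AsyncTaskExecutorLauncher` are hypothetical names made up for this example, and `ContainerStarter.start` only stands in for the blocking call to the YARN NMClient.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the blocking call that starts one container. */
interface ContainerStarter {
    void start(String containerId) throws Exception;
}

/** Sketch: run blocking launches on a pool so the caller's thread stays responsive. */
final class AsyncTaskExecutorLauncher implements AutoCloseable {
    private final ContainerStarter starter;
    private final ExecutorService launchPool;

    AsyncTaskExecutorLauncher(ContainerStarter starter, int poolSize) {
        this.starter = starter;
        this.launchPool = Executors.newFixedThreadPool(poolSize);
    }

    /**
     * Returns immediately. The future completes with the container id on
     * success, or exceptionally if the launch throws.
     */
    CompletableFuture<String> launch(String containerId) {
        CompletableFuture<String> result = new CompletableFuture<>();
        launchPool.execute(() -> {
            try {
                starter.start(containerId); // blocking work happens on a pool thread
                result.complete(containerId);
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
        });
        return result;
    }

    /** Stops accepting launches and waits up to 30 seconds for pending ones. */
    @Override
    public void close() {
        launchPool.shutdown();
        try {
            launchPool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }
}
```

With something like this, the caller would invoke `launch(...)` once per allocated container and react to the returned future, for example to release the container when the launch fails. In the actual RM, the completion handling would presumably have to be passed back to the RM main thread rather than run on a pool thread.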