[ https://issues.apache.org/jira/browse/FLINK-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xintong Song reassigned FLINK-19568:
------------------------------------

    Assignee: Xintong Song

> Offload creating TM launch contexts to the IO executor
> ------------------------------------------------------
>
>                 Key: FLINK-19568
>                 URL: https://issues.apache.org/jira/browse/FLINK-19568
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>             Fix For: 1.12.0
>
>
> Currently, to launch each TM container on Yarn, Flink creates a container
> launch context in the RM's RPC main thread. This includes accessing file
> status from remote file systems, which may block the RM's main thread,
> especially when the remote file system is slow. See [this
> thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TM-heartbeat-timeout-due-to-ResourceManager-being-busy-td38626.html].
> Creating the TM launch context neither accesses nor changes any of the RM's
> internal state. Therefore, I propose to offload the work to the IO executor.
> To be specific, I think the entire
> {{YarnResourceManagerDriver#createTaskExecutorLaunchContext}} can be invoked
> on the IO executor.
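
For illustration only, the offloading pattern proposed above might look roughly like the sketch below. It is not the actual Flink change: the class LaunchContextOffloadSketch, the LaunchContext stand-in, and the two single-threaded executors named "io-executor" and "rm-main-thread" are all assumptions made up for this example. It only shows the general idea of running the potentially blocking work on an IO executor and handing the result back to the main thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not Flink's actual YarnResourceManagerDriver code. */
final class LaunchContextOffloadSketch {

    /** Stand-in for the container launch context (assumed type). */
    static final class LaunchContext {
        final String containerId;
        final String createdOnThread;

        LaunchContext(String containerId, String createdOnThread) {
            this.containerId = containerId;
            this.createdOnThread = createdOnThread;
        }
    }

    /**
     * Stand-in for the potentially blocking work, such as fetching file
     * status from a remote file system. It touches no shared state.
     */
    static LaunchContext createLaunchContext(String containerId) {
        return new LaunchContext(containerId, Thread.currentThread().getName());
    }

    /**
     * Builds the launch context on the IO executor, then continues on the
     * main-thread executor, where internal state may be touched safely.
     * The returned string records which thread ran each step.
     */
    static CompletableFuture<String> startContainerAsync(
            String containerId, Executor ioExecutor, Executor mainThreadExecutor) {
        return CompletableFuture
                .supplyAsync(() -> createLaunchContext(containerId), ioExecutor)
                .thenApplyAsync(
                        ctx -> ctx.createdOnThread + " -> " + Thread.currentThread().getName(),
                        mainThreadExecutor);
    }

    /** Runs the sketch once with two named single-threaded executors. */
    static String demo() throws Exception {
        ExecutorService io =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "io-executor"));
        ExecutorService main =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "rm-main-thread"));
        try {
            return startContainerAsync("container_01", io, main).get(5, TimeUnit.SECONDS);
        } finally {
            io.shutdown();
            main.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "io-executor -> rm-main-thread"
    }
}
```

Because the context creation reads and writes no shared state, the first stage needs no synchronization. Only the follow-up stage, which would start the container and update bookkeeping, has to run on the main thread.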