[ 
https://issues.apache.org/jira/browse/FLINK-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-19568:
------------------------------------

    Assignee: Xintong Song

> Offload creating TM launch contexts to the IO executor
> ------------------------------------------------------
>
>                 Key: FLINK-19568
>                 URL: https://issues.apache.org/jira/browse/FLINK-19568
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>             Fix For: 1.12.0
>
>
> Currently, to launch each TM container on YARN, Flink creates a container 
> launch context in the RM's RPC main thread. This includes accessing file status 
> on remote file systems, which may block the RM's main thread, especially 
> when the remote file system is slow. See [this 
> thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TM-heartbeat-timeout-due-to-ResourceManager-being-busy-td38626.html].
> Creating the TM launch context neither accesses nor changes any of the RM's 
> internal state. Therefore, I propose offloading the work to the IO executor. 
> Specifically, I think the entire 
> {{YarnResourceManagerDriver#createTaskExecutorLaunchContext}} can be invoked 
> on the IO executor.
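
A minimal sketch of the offloading pattern proposed above, using only the JDK's CompletableFuture. It is not Flink's actual code: the class name, createLaunchContext, startContainerAsync, newNamedSingleThreadExecutor and the two executors are hypothetical stand-ins, assuming the RM exposes one executor for blocking IO and one for its main thread. The slow step runs on the IO executor, and only the follow-up step returns to the main thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LaunchContextOffloadSketch {

    // Hypothetical stand-in for the slow step: in the real driver this is
    // where file status would be fetched from a remote file system.
    static String createLaunchContext(String containerId) {
        return "ctx(" + containerId + ") built on " + Thread.currentThread().getName();
    }

    // Builds the context on ioExecutor, then continues on mainThreadExecutor,
    // so the main thread is never blocked by the slow step.
    static CompletableFuture<String> startContainerAsync(
            String containerId, Executor ioExecutor, Executor mainThreadExecutor) {
        return CompletableFuture
                .supplyAsync(() -> createLaunchContext(containerId), ioExecutor)
                .thenApplyAsync(
                        context -> context + ", started on " + Thread.currentThread().getName(),
                        mainThreadExecutor);
    }

    // Helper (assumption): a single daemon thread with a fixed name, so the
    // thread that ran each step is visible in the result.
    static ExecutorService newNamedSingleThreadExecutor(String name) {
        return Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true);
            return thread;
        });
    }

    public static void main(String[] args) {
        ExecutorService io = newNamedSingleThreadExecutor("io-executor");
        ExecutorService rmMain = newNamedSingleThreadExecutor("rm-main");
        // prints: ctx(container_1) built on io-executor, started on rm-main
        System.out.println(startContainerAsync("container_1", io, rmMain).join());
        io.shutdown();
        rmMain.shutdown();
    }
}
```

This split is safe only because, as stated above, creating the context does not touch the RM's internal state. Anything that does touch that state belongs in the continuation that runs on the main thread.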



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
