[ https://issues.apache.org/jira/browse/FLINK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yun Gao updated FLINK-14340:
----------------------------
    Fix Version/s: 1.16.0

> Specify a unique DFSClient name for Hadoop FileSystem
> ------------------------------------------------------
>
>                 Key: FLINK-14340
>                 URL: https://issues.apache.org/jira/browse/FLINK-14340
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystems
>            Reporter: Congxian Qiu
>            Priority: Minor
>              Labels: auto-deprioritized-major
>             Fix For: 1.15.0, 1.16.0
>
> Currently, when Flink reads from or writes to HDFS, we do not set the DFSClient name
> on any of the connections, so we cannot distinguish between them or quickly
> find the specific Job or TaskManager a connection belongs to.
> This issue proposes adding the {{container_id}} as a unique name when initializing the
> Hadoop FileSystem, so we can easily tell which Job/TM each connection belongs to.
>
> The core change is to add a line such as the following in
> {{org.apache.flink.runtime.fs.hdfs.HadoopFsFactory#create}}:
>
> {code:java}
> hadoopConfig.set("mapreduce.task.attempt.id",
>     System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
> {code}
>
> Currently, both {{YarnResourceManager}} and {{MesosResourceManager}} have the
> environment key {{ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"}}, so
> perhaps we should introduce this key in {{StandaloneResourceManager}} as well.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
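The name-resolution logic proposed in the issue can be sketched in isolation as follows. This is a minimal, self-contained illustration of the lookup-with-fallback behavior, not the actual Flink code: the constant values and the `resolveClientName` helper are assumptions chosen for the sketch (the real `CONTAINER_KEY_IN_ENV` and `DEFAULT_CONTAINER_ID` live in Flink's codebase), and a plain `Map` stands in for the environment so the fallback path can be exercised.

```java
import java.util.Map;

public class DfsClientNameSketch {
    // Assumed values for illustration; the real constants are defined in Flink.
    static final String CONTAINER_KEY_IN_ENV = "_FLINK_CONTAINER_ID";
    static final String DEFAULT_CONTAINER_ID = "unknown-container";

    // Hypothetical helper: returns the value that would be passed to
    // hadoopConfig.set("mapreduce.task.attempt.id", ...). When the container
    // ID variable is present (e.g. under YARN), the connection is tagged with
    // it; otherwise (e.g. standalone mode today) the default is used.
    static String resolveClientName(Map<String, String> env) {
        return env.getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID);
    }

    public static void main(String[] args) {
        // In the real factory this would use System.getenv().
        System.out.println(resolveClientName(System.getenv()));
    }
}
```

The fallback matters because {{StandaloneResourceManager}} does not currently export the container-ID variable, which is exactly the gap the last paragraph of the issue points out.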