[ https://issues.apache.org/jira/browse/FLINK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yang Wang updated FLINK-13938:
------------------------------
Description:
Currently, every time we start a Flink cluster, the Flink lib jars need to be uploaded to HDFS and registered as YARN local resources so that they can be downloaded to the JobManager and all TaskManager containers. There are two possible optimizations:
# Use a pre-uploaded Flink binary to avoid uploading the Flink system jars.
# By default, the LocalResourceVisibility is APPLICATION, so the jars are downloaded only once per node and shared by all TaskManager containers of the same application. However, different applications still have to download all jars every time, including flink-dist.jar. We could use the YARN public cache to eliminate this unnecessary downloading and make container launch faster.

Taking both FLINK-13938 and FLINK-14964 into account, the whole submission optimization will be done in the following steps:
* Add {{yarn.provided.lib.dirs}} to configure pre-uploaded libs. These directories contain files that are useful for all users of the platform (i.e. different applications), so they need to be publicly readable and will be registered as local resources with {{PUBLIC}} visibility. For the first version, only flink-dist, lib/ and plugins/ are automatically excluded from uploading if the {{yarn.pre-uploaded.flink.path}} contains a file with the same name. This will be done in FLINK-13938.
* Make all the options (including the user jar, flink-dist-*.jar, libs, etc.) support remote paths. With this feature, the Flink client no longer needs a local user jar and dependencies. Combined with application mode, a deployer (i.e. a Flink job management system) will get better performance. This will be done in FLINK-14964.

How to use the pre-upload feature?
1. Upload the Flink binary to the HDFS directories
2. Use {{yarn.provided.lib.dirs}} to specify the pre-uploaded libs

A final submission command could look like the following:
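Before such a command is issued, the directories referenced by {{yarn.provided.lib.dirs}} must already be staged in HDFS and, since they are registered with {{PUBLIC}} visibility, be world-readable. A hedged sketch of the staging step is below; the {{hadoop fs}} invocations assume a configured HDFS client and are shown commented out (the base path {{hdfs://myhdfs/flink}} is illustrative), while the bash at the end only assembles the comma-separated option value:

```shell
#!/usr/bin/env bash
# Illustrative base path; adjust for your cluster.
FLINK_HDFS_BASE="hdfs://myhdfs/flink"

# Step 1: stage the Flink binary once (needs an HDFS client, so the
# commands are commented out here):
#   hadoop fs -mkdir -p "$FLINK_HDFS_BASE"
#   hadoop fs -put lib plugins "$FLINK_HDFS_BASE"
#   # PUBLIC visibility requires world-readable files and parent dirs:
#   hadoop fs -chmod -R 755 "$FLINK_HDFS_BASE"

# Step 2: build the comma-separated value for yarn.provided.lib.dirs.
DIRS=("$FLINK_HDFS_BASE/lib" "$FLINK_HDFS_BASE/plugins")
PROVIDED_LIB_DIRS=$(IFS=,; echo "${DIRS[*]}")
echo "$PROVIDED_LIB_DIRS"
```

The echoed value is what gets passed via {{-yD yarn.provided.lib.dirs=...}} in the submission command.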
{code:java}
./bin/flink run -m yarn-cluster -d \
    -yD yarn.provided.lib.dirs=hdfs://myhdfs/flink/lib,hdfs://myhdfs/flink/plugins \
    examples/streaming/WindowJoin.jar
{code}

How to use the remote path with application mode?

{code:java}
./bin/flink run -m yarn-cluster -d \
    -yD yarn.provided.lib.dirs=hdfs://myhdfs/flink/lib,hdfs://myhdfs/flink/plugins \
    hdfs://myhdfs/jars/WindowJoin.jar
{code}

was:
Currently, every time we start a Flink cluster, the Flink lib jars need to be uploaded to HDFS and registered as YARN local resources so that they can be downloaded to the JobManager and all TaskManager containers. There are two possible optimizations:
# Use a pre-uploaded Flink binary to avoid uploading the Flink system jars.
# By default, the LocalResourceVisibility is APPLICATION, so the jars are downloaded only once per node and shared by all TaskManager containers of the same application. However, different applications still have to download all jars every time, including flink-dist.jar. We could use the YARN public cache to eliminate this unnecessary downloading and make container launch faster.

Following the discussion on the user ML:
[https://lists.apache.org/list.html?u...@flink.apache.org:lte=1M:Flink%20Conf%20%22yarn.flink-dist-jar%22%20Question]

Taking both FLINK-13938 and FLINK-14964 into account, this feature will be done in the following steps:
* Enrich "\-yt/--yarnship" to support HDFS directories.
* Add a new config option to control whether to disable the flink-dist uploading (*will be extended to support all files, including lib/plugins/user jars/dependencies/etc.*).
* Enrich "\-yt/--yarnship" to specify the local resource visibility. It is "APPLICATION" by default. It could also be configured to "PUBLIC", which means shared by all applications, or "PRIVATE", which means shared by the same user. (*Will be done later according to the feedback.*)

How to use this feature?
1. Upload the Flink binary and user jars to the HDFS directories
2. Use "\-yt/--yarnship" to specify the pre-uploaded libs
3. Disable the automatic uploading of flink-dist via {{yarn.submission.automatic-flink-dist-upload}}: false

A final submission command could look like the following:

{code:java}
./bin/flink run -m yarn-cluster -d \
    -yt hdfs://myhdfs/flink/release/flink-1.11 \
    -yD yarn.submission.automatic-flink-dist-upload=false \
    examples/streaming/WindowJoin.jar
{code}

> Use pre-uploaded libs to accelerate flink submission
> ----------------------------------------------------
>
>                 Key: FLINK-13938
>                 URL: https://issues.apache.org/jira/browse/FLINK-13938
>             Project: Flink
>          Issue Type: New Feature
>          Components: Client / Job Submission, Deployment / YARN
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)