Thanks @Tison for starting the discussion, and sorry for joining so late.

Yes, I think this is a very good idea. We already tweak the flink-yarn package internally to support something similar to what @Thomas mentioned: registering a jar that has already been uploaded to some DFS (not necessarily the YARN public cache discussed in FLINK-13938). The reason is that we provide our internally packaged extension libraries to our customers, and we have seen a good performance improvement in our YARN cluster during the container localization phase after our customers switched to pre-uploaded JARs instead of uploading them on every deployment.
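
To make the distinction concrete, here is a minimal sketch of the decision a client would have to make: upload a jar from the local machine, or register one that already sits on remote storage. This is not Flink's or our internal implementation; the class name `JarLocation`, the method `isRemote`, and the example paths are all hypothetical, and it only inspects the URI scheme using the JDK.

```java
import java.net.URI;

public class JarLocation {

    /**
     * Hypothetical helper: returns true when the given path names a jar that
     * already lives on remote storage (any scheme other than "file").
     */
    public static boolean isRemote(String jarPath) {
        URI uri = URI.create(jarPath);
        String scheme = uri.getScheme();
        // No scheme, or file://, means the jar is on the client machine
        // and would still have to be uploaded at submission time.
        return scheme != null && !scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        // Example paths are assumptions for illustration only.
        String[] paths = {
            "/opt/flink/lib/flink-dist.jar",
            "file:///opt/flink/lib/flink-dist.jar",
            "hdfs://namenode:8020/flink/flink-dist.jar"
        };
        for (String p : paths) {
            String action = isRemote(p)
                ? "register as a remote resource"
                : "upload from the client";
            System.out.println(p + " -> " + action);
        }
    }
}
```

In a real YARN deployment the "remote" branch would presumably hand the path to YARN for localization rather than copying the file, which is where the saving described above comes from.
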
Looking forward to this feature!

--
Rong

On Tue, Nov 19, 2019 at 10:19 PM tison <wander4...@gmail.com> wrote:

> Thanks for your participation!
>
> @Yang: Great to hear. I'd like to know whether or not a remote flink jar
> path conflicts with FLINK-13938. IIRC FLINK-13938 automatically excludes
> the local flink jar from shipping, which possibly does not work for a
> remote one.
>
> @Thomas: It is inspiring that a URL becomes the unified representation of
> a resource. I'm thinking of how to provide a single process for getting a
> resource from a URL which points to an artifact or a distributed file
> system.
>
> @ouywl & Stephan: Yes, this improvement can be migrated to environments
> like k8s. IIRC the k8s proposal already discussed improvements using
> "init container" and other technologies. However, so far I regard it as an
> improvement that differs from one storage to another, so we can achieve
> them individually.
>
> Best,
> tison.
>
> Stephan Ewen <se...@apache.org> wrote on Wed, Nov 20, 2019 at 12:34 AM:
>
>> Would that be a feature specific to Yarn? (and maybe standalone sessions)
>>
>> For containerized setups, an init container seems like a nice way to
>> solve this. Also more flexible, when it comes to supporting authentication
>> mechanisms for the target storage system, etc.
>>
>> On Tue, Nov 19, 2019 at 5:29 PM ouywl <ou...@139.com> wrote:
>>
>>> I have implemented this feature in our env, using the 'Init Container'
>>> of docker to get the URL of a jar file. It seems a good idea.
>>>
>>> ouywl
>>> ou...@139.com
>>>
>>> On 11/19/2019 12:11, Thomas Weise <t...@apache.org> wrote:
>>>
>>> There is a related use case (not specific to HDFS) that I came across:
>>>
>>> It would be nice if the jar upload endpoint could accept the URL of a
>>> jar file as an alternative to the jar file itself. Such a URL could point
>>> to an artifactory or distributed file system.
>>>
>>> Thomas
>>>
>>> On Mon, Nov 18, 2019 at 7:40 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>
>>>> Hi tison,
>>>>
>>>> Thanks for starting this discussion.
>>>> * For a user-customized flink-dist jar, it is a useful feature, since it
>>>> could avoid uploading the flink-dist jar every time. Especially in
>>>> production environments, it could accelerate the submission process.
>>>> * For the standard flink-dist jar, FLINK-13938 [1] could solve the
>>>> problem. Upload an official flink release binary to distributed storage
>>>> (hdfs) first, and then all submissions could benefit from it. Users could
>>>> also upload a customized flink-dist jar to accelerate their submission.
>>>>
>>>> If the flink-dist jar could be specified as a remote path, maybe the
>>>> user jar is in the same situation.
>>>>
>>>> [1]. https://issues.apache.org/jira/browse/FLINK-13938
>>>>
>>>> tison <wander4...@gmail.com> wrote on Tue, Nov 19, 2019 at 11:17 AM:
>>>>
>>>> > Hi folks,
>>>> >
>>>> > Recently, our customers asked for a feature to configure a remote
>>>> > flink jar. I'd like to reach out to you to see whether or not it is a
>>>> > general need.
>>>> >
>>>> > ATM Flink only supports configuring a local file as the flink jar via
>>>> > the `-yj` option.
>>>> > If we pass an HDFS file path, due to an implementation detail it will
>>>> > fail with IllegalArgumentException. In the story where we support
>>>> > configuring a remote flink jar, this limitation is eliminated. We also
>>>> > make use of YARN locality to reduce the uploading overhead: instead of
>>>> > uploading, we ask YARN to localize the jar when the AM container
>>>> > starts.
>>>> >
>>>> > Besides, it possibly overlaps with FLINK-13938. I'd like to put the
>>>> > discussion on our mailing list first.
>>>> >
>>>> > Are you looking forward to such a feature?
>>>> >
>>>> > @Yang Wang: this feature is different from the one we discussed
>>>> > offline; it only focuses on the flink jar, not all ship files.
>>>> >
>>>> > Best,
>>>> > tison.
>>>> >