Hi Abdul,

as Biao said, the `--classpath` option should only be used if you want to make dependencies available that are not included in the submitted user code jar, e.g. if you have installed a large library on the cluster nodes that is too costly to ship every time you submit a job. Usually, you do not need to specify this option if you build an uber jar.
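To illustrate why such URLs have to resolve on every node, here is a minimal Java sketch. It assumes that the `--classpath` entries travel with the job as plain URLs (not as file contents) and are handed to a URL-based class loader on each node, next to the shipped user jar. The class `ClasspathSketch` and its methods are hypothetical names for illustration only, not Flink's actual classes.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Flink code.
class ClasspathSketch {

    /** Returns the file: URLs that do not point at an existing file on THIS machine. */
    public static List<URL> missingLocally(List<URL> classpath) {
        List<URL> missing = new ArrayList<>();
        for (URL url : classpath) {
            if (!"file".equals(url.getProtocol())) {
                continue; // only local file URLs are checked in this sketch
            }
            try {
                if (!Files.exists(Paths.get(url.toURI()))) {
                    missing.add(url);
                }
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException("Not a valid URI: " + url, e);
            }
        }
        return missing;
    }

    /** Builds a class loader over the shipped jars plus the extra classpath URLs. */
    public static URLClassLoader userCodeLoader(List<URL> shippedJars, List<URL> classpath) {
        List<URL> all = new ArrayList<>(shippedJars);
        all.addAll(classpath);
        // Constructing the loader does not fail for unreachable URLs;
        // the failure only shows up later, when a class is looked up.
        return new URLClassLoader(all.toArray(new URL[0]),
                ClasspathSketch.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a library that was installed on this node in advance.
        Path installed = Files.createTempFile("preinstalled-lib", ".jar");
        URL present = installed.toUri().toURL();
        // Stand-in for a path that exists on the client but not on this node.
        URL absent = installed.resolveSibling("not-installed-here.jar").toUri().toURL();

        List<URL> classpath = Arrays.asList(present, absent);
        System.out.println("Unresolvable on this node: " + missingLocally(classpath));

        try (URLClassLoader loader = userCodeLoader(new ArrayList<>(), classpath)) {
            System.out.println("Loader was built with " + loader.getURLs().length + " URLs");
        } finally {
            Files.deleteIfExists(installed);
        }
    }
}
```

Under this assumption, only the URL string is passed around, so each node that loads user code has to be able to open it itself, for example via a path that exists on every machine or a shared mount.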
Cheers,
Till

On Tue, Jun 18, 2019 at 7:23 AM Biao Liu <mmyy1...@gmail.com> wrote:

> Ah, sorry for the misunderstanding.
> So what you are asking is why we need "--classpath"? I'm not sure
> what the original author had in mind. I guess the points listed below
> might have been considered.
> 1. Avoid duplicated deployment. If some common jars are deployed in advance
> to each node of the cluster, the jobs that depend on these jars can avoid
> deploying them one by one.
> 2. Support NFS, which is mentioned in the option description of "--classpath".
>
> On Tue, Jun 18, 2019 at 11:45 AM Abdul Qadeer <quadeer....@gmail.com> wrote:
>
>> Hi Biao,
>>
>> I am aware of it - that's not my question.
>>
>> On Mon, Jun 17, 2019 at 7:42 PM Biao Liu <mmyy1...@gmail.com> wrote:
>>
>>> Hi Abdul, "--classpath <url>" can be used for classes that are not
>>> included in the user jar. If all your classes are included in the jar
>>> passed to Flink, you don't need "--classpath".
>>>
>>> On Tue, Jun 18, 2019 at 3:08 AM Abdul Qadeer <quadeer....@gmail.com> wrote:
>>>
>>>> Hi!
>>>>
>>>> I was going through the submission of a Flink program through the CLI.
>>>> I see that, as per the documentation, "--classpath <url>" needs to be
>>>> accessible from all nodes in the cluster. As I understand it, the jar
>>>> files are already part of the blob uploaded to the JobManager from the
>>>> CLI. The TaskManagers can download this blob when they receive the task
>>>> and access the classes from there. Why is there a need to be able to
>>>> access these files from every node then? It would make sense to use a
>>>> distributed file system to access these jars if the network cannot be
>>>> reached to download the blob files, or if the blob doesn't contain
>>>> metadata to differentiate between child class loader classes and the
>>>> rest. However, it seems like the TaskManager always tries to access the
>>>> specified class paths irrespective of network partitions.