Hi, I know about that approach.
I don't want to run a mess of classes from a single jar; I want to use the
distributed cache and ship the application jar and its dependent jars
explicitly.
--deploy-mode client unfortunately copies and distributes all the jars
again for every Spark job that is started:
spark-submit \
  --conf "spark.driver.userClassPathFirst=true" \
  --class com.MyClass \
  --master yarn \
  --deploy-mode client \
  --jars hdfs:///my-lib.jar,hdfs:///my-second-lib.jar \
  jar-with-com-MyClass.jar job_params
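
In other words, what I am after is upload-once/reference-many. A sketch of
the intent (the jar names are the same placeholders as above):

# upload the dependency jars to HDFS a single time
hdfs dfs -put my-lib.jar my-second-lib.jar /

# after that, every submit should only reference the cached copies via
# --jars hdfs:///my-lib.jar,hdfs:///my-second-lib.jar instead of shipping
# local files to the cluster again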
2016-05-17 15:41 GMT+02:00 Serega Sheypak :
Hi Serega,
Create a jar including all the dependencies and execute it like below
through a shell script:
# location of your spark-submit binary; "classname" and the jar name below
# are placeholders for your main class and your assembled jar
/usr/local/spark/bin/spark-submit \
  --class classname \
  --master yarn \
  --deploy-mode cluster \
  your-app-with-dependencies.jar
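
The bundling step itself could look like this, assuming an sbt project with
the sbt-assembly plugin (Maven users would configure the shade plugin
instead; the output path is just an example):

# build one jar that bundles the application together with its dependencies
sbt assembly
# pass the resulting jar, e.g. target/scala-2.10/your-app-with-dependencies.jar,
# to spark-submit as shown above; no --jars needed since everything is inside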
https://issues.apache.org/jira/browse/SPARK-10643
Looks like it's the reason...
2016-05-17 15:31 GMT+02:00 Serega Sheypak :
No, and it looks like a problem.
2.2. --master yarn --deploy-mode client
means:
1. Submit Spark as a YARN app, but the spark-driver is started on the local
machine.
2. I upload all dependent jars to HDFS and specify the jar HDFS paths in the
--jars arg (see the sketch after this list).
3. The driver runs my Spark application main class named
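
Concretely, steps 1-2 look like this (the dependency jar names match my
--jars line below; the app jar name is a placeholder):

# upload the dependent jars to HDFS once
hdfs dfs -put commons.jar super.jar /my/home/

# client-mode submit referencing the HDFS copies
spark-submit \
  --class com.MyClass \
  --master yarn \
  --deploy-mode client \
  --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar \
  my-app.jar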
Do you put your app jar on HDFS? The app jar must be on your local machine.
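
To sketch the distinction (paths are placeholders):

# fine: dependency jars on HDFS, application jar on the local machine
spark-submit --class com.MyClass --master yarn --deploy-mode client \
  --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar \
  /local/path/to/my-app.jar

# problematic in client mode: the application jar itself as an hdfs:// URI
spark-submit --class com.MyClass --master yarn --deploy-mode client \
  hdfs:///my/home/my-app.jar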
On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak
wrote:
Hi, I'm trying to:
1. upload my app jar files to HDFS
2. run spark-submit with:
2.1. --master yarn --deploy-mode cluster
or
2.2. --master yarn --deploy-mode client
specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
When the Spark job is submitted, the SparkSubmit client outputs: