Re: spark-submit config via file

Yong Zhang Fri, 24 Mar 2017 06:19:25 -0700

Of course it is possible.


You can always set any configuration in your application through the API, 
instead of passing it in through the CLI.


// xxx stands for any other Spark property key
val sparkConf = new SparkConf()
  .setAppName(properties.get("appName"))
  .setMaster(properties.get("master"))
  .set(xxx, properties.get("xxx"))
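
As a rough sketch of where such a properties object could come from: the helper below reads properties-style text (the same "key value" layout as the properties.conf in the question) into a plain Map using only the JDK. The names ConfFromFile and parse are hypothetical and not part of Spark; this is one possible approach under those assumptions, not the only way to do it.

```scala
import java.io.StringReader
import java.util.Properties
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark.
object ConfFromFile {
  // Parse properties-style text into an immutable Map.
  // java.util.Properties accepts whitespace, '=' or ':' as the
  // separator between key and value, and skips blank lines.
  def parse(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    props.stringPropertyNames().asScala
      .map(key => key -> props.getProperty(key).trim) // drop trailing spaces
      .toMap
  }
}
```

With spark-core on the classpath, the resulting map can be handed to the configuration in one call, for example new SparkConf().setAll(ConfFromFile.parse(text)), where text is the content of the file read by whatever means suits your application.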

The error you are seeing is a problem with your environment.

Yong
________________________________
From: , Roy <rp...@njit.edu>
Sent: Friday, March 24, 2017 7:38 AM
To: user
Subject: spark-submit config via file

Hi,

I am trying to deploy a Spark job using spark-submit, which takes a bunch of 
parameters, like this:

spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode 
cluster --executor-memory 3072m --executor-cores 4 --files streaming.conf 
spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar -conf "streaming.conf"

I was looking for a way to put all these flags in a file and pass that file to 
spark-submit, to make my spark-submit command simpler, like this:

spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode 
cluster --properties-file properties.conf --files streaming.conf 
spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar -conf "streaming.conf"

properties.conf has the following contents:


spark.executor.memory 3072m

spark.executor.cores 4


But I am getting the following error:


17/03/24 11:36:26 INFO Client: Use hdfs cache file as spark.yarn.archive for 
HDP, 
hdfsCacheFile:hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz

17/03/24 11:36:26 WARN AzureFileSystemThreadPoolExecutor: Disabling threads for 
Delete operation as thread count 0 is <= 1

17/03/24 11:36:26 INFO AzureFileSystemThreadPoolExecutor: Time taken for Delete 
operation is: 1 ms with threads: 0

17/03/24 11:36:27 INFO Client: Deleted staging directory 
wasb://a...@abc.blob.core.windows.net/user/sshuser/.sparkStaging/application_1488402758319_0492

Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: 
hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz

        at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:154)

        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2791)

        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)

        at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2825)

        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2807)

        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)

        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)

        at 
org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:364)

        at 
org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:480)

        at 
org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:552)

        at 
org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:881)

        at 
org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:170)

        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1218)

        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1277)

        at org.apache.spark.deploy.yarn.Client.main(Client.scala)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:498)

        at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)

        at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)

        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)

        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)

        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

17/03/24 11:36:27 INFO MetricsSystemImpl: Stopping azure-file-system metrics 
system...

Does anyone know if this is even possible?


Thanks...

Roy
