spark-submit config via file

Roy Fri, 24 Mar 2017 04:38:53 -0700

Hi,

I am trying to deploy a Spark job using spark-submit, which takes a bunch of
parameters, like this:


spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode
cluster --executor-memory 3072m --executor-cores 4 --files streaming.conf
spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar -conf "streaming.conf"

I was looking for a way to put all these flags in a file that I pass to
spark-submit, to make my spark-submit command simpler, like this:

spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode
cluster --properties-file properties.conf --files streaming.conf
spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar -conf "streaming.conf"

properties.conf has the following contents:


spark.executor.memory 3072m

spark.executor.cores 4


But I am getting the following error:


17/03/24 11:36:26 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
17/03/24 11:36:26 WARN AzureFileSystemThreadPoolExecutor: Disabling threads for Delete operation as thread count 0 is <= 1
17/03/24 11:36:26 INFO AzureFileSystemThreadPoolExecutor: Time taken for Delete operation is: 1 ms with threads: 0
17/03/24 11:36:27 INFO Client: Deleted staging directory wasb://a...@abc.blob.core.windows.net/user/sshuser/.sparkStaging/application_1488402758319_0492
Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2791)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2825)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2807)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:364)
        at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:480)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:552)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:881)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:170)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1218)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1277)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/03/24 11:36:27 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...

Does anyone know if this is even possible?
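
In case it helps frame the question, here is a rough sketch of a fallback I
could imagine. It rests on an assumption I have not confirmed for this HDP
build: that passing --properties-file makes spark-submit read only that file
and skip the cluster's conf/spark-defaults.conf, so cluster-provided settings
go missing. The sketch reads my properties.conf and turns each entry into a
--conf key=value flag, which spark-submit accepts alongside the default file.
The helper names parse_properties and to_conf_args are made up for
illustration, and the parser only handles simple one-line "key value" or
"key=value" entries.

```python
import re
import shlex

# Hypothetical helper: parse simple Spark-style properties text.
# Handles "key value", "key=value" and "key: value"; skips blanks and comments.
_LINE = re.compile(r"([^\s=:]+)\s*[=:\s]\s*(.*)")


def parse_properties(text):
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match:
            key, value = match.groups()
            props[key] = value.strip()
    return props


# Hypothetical helper: turn parsed properties into spark-submit --conf flags.
def to_conf_args(props):
    args = []
    for key, value in props.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Same contents as the properties.conf shown above.
    sample = "spark.executor.memory 3072m\n\nspark.executor.cores 4\n"
    cmd = ["spark-submit"] + to_conf_args(parse_properties(sample)) + [
        "--class", "StreamingEventWriterDriver",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--files", "streaming.conf",
        "spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar",
        "-conf", "streaming.conf",
    ]
    # Print a shell-safe command line instead of running it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

This only prints the command, so it can be checked before running anything.
It would not tell me whether the missing defaults are really the cause of the
error above.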


Thanks...

Roy
