I want to execute my processes in cluster mode. Since I don't know where the
driver will be executed, I have to make all the files it needs available to
it. I understand that there are two options: copy all the files to all the
nodes, or copy them to HDFS.

My doubt is: if I want to put all the files in HDFS, isn't that handled
automatically by the --files and --jars parameters of the spark-submit
command, or do I have to copy them to HDFS manually?
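
By "copy them to HDFS manually" I mean something along these lines (the
target directory is just an example, not a real path in my cluster):

  # upload the driver's config file to a directory on HDFS
  hdfs dfs -mkdir -p /apps/example/conf
  hdfs dfs -put conf/${1}Conf.json /apps/example/conf/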

My idea is to execute something like:
spark-submit \
  --driver-java-options "-Dlogback.configurationFile=conf/${1}Logback.xml" \
  --class com.example.Launcher \
  --driver-class-path lib/spark-streaming-kafka-0-10_2.11-2.0.2.jar:lib/kafka-clients-1.0.0.jar \
  --files /conf/${1}Conf.json \
  example-0.0.1-SNAPSHOT.jar conf/${1}Conf.json
I have also tried --files hdfs://.... without copying anything to HDFS first,
and that doesn't work either.
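
For reference, that failing attempt looked roughly like the following (the
hdfs:// path is only illustrative, and in my test nothing had actually been
copied to that location; the --master / --deploy-mode cluster flags are left
out as in the command above):

  spark-submit \
    --driver-java-options "-Dlogback.configurationFile=conf/${1}Logback.xml" \
    --class com.example.Launcher \
    --driver-class-path lib/spark-streaming-kafka-0-10_2.11-2.0.2.jar:lib/kafka-clients-1.0.0.jar \
    --files hdfs:///apps/example/conf/${1}Conf.json \
    example-0.0.1-SNAPSHOT.jar conf/${1}Conf.json

Is the hdfs dfs -put step required first, or should spark-submit upload the
local file on its own?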
