So a bit more investigation shows that:

if I have configured spark-defaults.conf with:

"spark.files          library.py"

then if I call

"spark-submit.py -v test.py"

I see that my "spark.files" default has been replaced with
"spark.files      test.py"; spark-submit is overwriting
spark.files with the name of the script itself.
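
One way to double-check the effective value from inside the script is to
read it off a fresh SparkConf, which picks up whatever properties
spark-submit passed to the JVM. A minimal sketch:

    # print what spark.files actually resolves to at runtime
    from pyspark import SparkConf

    conf = SparkConf()  # loads the properties set by spark-submit
    print(conf.get("spark.files", "<not set>"))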

Is this a bug, or is there another way to add default libraries without
having to specify them on the command line?
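
In the meantime, a workaround that avoids the command line is to register
the dependency from inside the script itself with SparkContext.addPyFile.
A sketch, assuming library.py sits next to test.py on the driver machine:

    # test.py -- ship library.py without relying on spark.files/--py-files
    import os
    from pyspark import SparkContext

    sc = SparkContext(appName="test")
    here = os.path.dirname(os.path.abspath(__file__))
    # distributes library.py to the executors and adds it to the Python path
    sc.addPyFile(os.path.join(here, "library.py"))

    import library  # safe to import after addPyFile

Worth noting: spark.files only distributes files to the executors' working
directories, while --py-files (and addPyFile) additionally put them on the
Python path, which may be part of why the --py-files invocation imports
cleanly.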

Thanks,

-Axel



On Wed, Sep 2, 2015 at 10:34 PM, Davies Liu <dav...@databricks.com> wrote:

> This should be a bug; could you create a JIRA for it?
>
> On Wed, Sep 2, 2015 at 4:38 PM, Axel Dahl <a...@whisperstream.com> wrote:
> > in my spark-defaults.conf I have:
> > spark.files               file1.zip, file2.py
> > spark.master           spark://master.domain.com:7077
> >
> > If I execute:
> > bin/pyspark
> >
> > I can see it adding the files correctly.
> >
> > However if I execute
> >
> > bin/spark-submit test.py
> >
> > where test.py relies on file1.zip, I get an error.
> >
> > If I instead execute
> >
> > bin/spark-submit --py-files file1.zip test.py
> >
> > It works as expected.
> >
> > How do I get spark-submit to pick up the spark-defaults.conf file, or what
> > should I start checking to figure out why one works and the other
> > doesn't?
> >
> > Thanks,
> >
> > -Axel
>
