I haven't actually tried mixing non-Spark settings into the Spark
properties. Instead, I package my properties into the JAR and use the
Typesafe Config library[1] (v1.2.1), along with Ficus[2] for Scala-friendly
access, to get at my properties:

Properties file: src/main/resources/integration.conf
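
Its contents might look something like this (hypothetical keys and
values, just for illustration; yours will differ):

    # integration.conf -- example values only
    job.output.dir = "file:///tmp/integration/out"
    job.batch.size = 100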

(In the command below, $ENV might be set to either "integration" or "prod"[3].)

ssh -t root@$HOST "/root/spark/bin/spark-shell --jars /root/$JAR_NAME \
    --conf 'config.resource=$ENV.conf' \
    --conf 'spark.executor.extraJavaOptions=-Dconfig.resource=$ENV.conf'"

Since the properties file is packaged up with the JAR, I don't have to
worry about sending it separately to all of the slave nodes. Typesafe
Config is written in Java, so it will work even if you're not using Scala.
(Typesafe Config also has the advantage of being extremely easy to
integrate with code that uses Java Properties today.)
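
Reading a value back out then looks roughly like this (a sketch;
job.output.dir is just an example key borrowed from your question):

    import com.typesafe.config.{Config, ConfigFactory}

    // Loads application.conf/reference.conf from the classpath, or
    // whatever -Dconfig.resource / -Dconfig.file points at.
    val config: Config = ConfigFactory.load()
    val outputDir: String = config.getString("job.output.dir")

    // With Ficus, the same read in Scala-friendly form:
    // import net.ceedubs.ficus.Ficus._
    // val outputDir = config.as[String]("job.output.dir")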

If you instead want to send the file separately from the JAR and you use
the Typesafe Config library, you can specify "config.file" instead of
"config.resource"; though I'd point you to [3] below if you want to make
your development life easier.
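
A sketch of that variant (untested here, and the paths are examples
only): in client mode, ship the file with --files so each executor gets
a copy in its working directory, and point both JVMs at it:

    spark-submit --files /local/path/app.conf \
        --driver-java-options '-Dconfig.file=/local/path/app.conf' \
        --conf 'spark.executor.extraJavaOptions=-Dconfig.file=app.conf' \
        --class com.myModule mymodule.jar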

1. https://github.com/typesafehub/config
2. https://github.com/ceedubs/ficus
3. http://deploymentzone.com/2015/01/27/spark-ec2-and-easy-spark-shell-deployment/



On Mon Feb 16 2015 at 10:27:01 AM Emre Sevinc <emre.sev...@gmail.com> wrote:

> Hello,
>
> I'm using Spark 1.2.1 and have a module.properties file, and in it I have
> non-Spark properties, as well as Spark properties, e.g.:
>
>    job.output.dir=file:///home/emre/data/mymodule/out
>
> I'm trying to pass it to spark-submit via:
>
>    spark-submit --class com.myModule --master local[4] --deploy-mode
> client --verbose --properties-file /home/emre/data/mymodule.properties
> mymodule.jar
>
> And I thought I could read the value of my non-Spark property, namely,
> job.output.dir by using:
>
>     SparkConf sparkConf = new SparkConf();
>     final String validatedJSONoutputDir = sparkConf.get("job.output.dir");
>
> But it gives me an exception:
>
>     Exception in thread "main" java.util.NoSuchElementException:
> job.output.dir
>
> Is it not possible to mix Spark and non-Spark properties in a single
> .properties file, then pass it via --properties-file and then get the
> values of those non-Spark properties via SparkConf?
>
> Or is there another object / method to retrieve the values for those
> non-Spark properties?
>
>
> --
> Emre Sevinç
>
