I've decided to try spark-submit ... --conf "spark.driver.extraJavaOptions=-DpropertiesFile=/home/emre/data/myModule.properties"
But when I try to retrieve the value of propertiesFile via

  System.err.println("propertiesFile : " + System.getProperty("propertiesFile"));

I get null:

  propertiesFile : null

Interestingly, when I run spark-submit with --verbose, I see that it prints:

  spark.driver.extraJavaOptions -> -DpropertiesFile=/home/emre/data/belga/schemavalidator.properties

I don't understand why I can't retrieve the value of "propertiesFile" using the
standard System.getProperty method. (I can use
new SparkConf().get("spark.driver.extraJavaOptions") and parse it manually to
retrieve the value, but I'd like to know why System.getProperty doesn't work.)
Any ideas?

If I can achieve what I've described above, I plan to pass a properties file
that resides on HDFS, so that it will be available to my driver program
wherever that program runs.

--
Emre

On Mon, Feb 16, 2015 at 4:41 PM, Charles Feduke <charles.fed...@gmail.com> wrote:
> I haven't actually tried mixing non-Spark settings into the Spark
> properties. Instead I package my properties into the jar and use the
> Typesafe Config[1] - v1.2.1 - library (along with Ficus[2] - Scala
> specific) to get at my properties:
>
> Properties file: src/main/resources/integration.conf
>
> (below $ENV might be set to either "integration" or "prod"[3])
>
> ssh -t root@$HOST "/root/spark/bin/spark-shell --jars /root/$JAR_NAME \
>   --conf 'config.resource=$ENV.conf' \
>   --conf 'spark.executor.extraJavaOptions=-Dconfig.resource=$ENV.conf'"
>
> Since the properties file is packaged up with the JAR I don't have to
> worry about sending the file separately to all of the slave nodes.
> Typesafe Config is written in Java so it will work if you're not using
> Scala. (The Typesafe Config library also has the advantage of being
> extremely easy to integrate with code that is using Java Properties
> today.)
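The manual workaround mentioned at the top of the thread — fetching spark.driver.extraJavaOptions from SparkConf and parsing out the -D value yourself — could be sketched as follows. The ExtraJavaOptionsParser class and its helper are illustrative only, not part of any Spark API; in the driver the input string would come from new SparkConf().get("spark.driver.extraJavaOptions"):

```java
public class ExtraJavaOptionsParser {

    // Scan a "-Dkey=value -Dother=value ..." java-options string for one key.
    static String extractJavaOption(String javaOptions, String key) {
        String prefix = "-D" + key + "=";
        for (String token : javaOptions.trim().split("\\s+")) {
            if (token.startsWith(prefix)) {
                return token.substring(prefix.length());
            }
        }
        return null; // key not present among the -D options
    }

    public static void main(String[] args) {
        String opts = "-DpropertiesFile=/home/emre/data/myModule.properties";
        // prints /home/emre/data/myModule.properties
        System.out.println(extractJavaOption(opts, "propertiesFile"));
    }
}
```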
> If you instead want to send the file separately from the JAR and you use
> the Typesafe Config library, you can specify "config.file" instead of
> ".resource"; though I'd point you to [3] below if you want to make your
> development life easier.
>
> 1. https://github.com/typesafehub/config
> 2. https://github.com/ceedubs/ficus
> 3. http://deploymentzone.com/2015/01/27/spark-ec2-and-easy-spark-shell-deployment/
>
> On Mon Feb 16 2015 at 10:27:01 AM Emre Sevinc <emre.sev...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm using Spark 1.2.1 and have a module.properties file, and in it I
>> have non-Spark properties, as well as Spark properties, e.g.:
>>
>> job.output.dir=file:///home/emre/data/mymodule/out
>>
>> I'm trying to pass it to spark-submit via:
>>
>> spark-submit --class com.myModule --master local[4] --deploy-mode client \
>>   --verbose --properties-file /home/emre/data/mymodule.properties mymodule.jar
>>
>> And I thought I could read the value of my non-Spark property, namely
>> job.output.dir, by using:
>>
>> SparkConf sparkConf = new SparkConf();
>> final String validatedJSONoutputDir = sparkConf.get("job.output.dir");
>>
>> But it gives me an exception:
>>
>> Exception in thread "main" java.util.NoSuchElementException: job.output.dir
>>
>> Is it not possible to mix Spark and non-Spark properties in a single
>> .properties file, then pass it via --properties-file and then get the
>> values of those non-Spark properties via SparkConf?
>>
>> Or is there another object / method to retrieve the values for those
>> non-Spark properties?
>>
>> --
>> Emre Sevinç

--
Emre Sevinc
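On the last question in the quoted message — another object/method for those non-Spark properties — one plain-Java option is to skip SparkConf for the non-Spark keys and load the same file directly with java.util.Properties, which holds Spark and non-Spark entries side by side. A minimal sketch; the property values are taken from the thread, and loading from a StringReader is only to keep the example self-contained (in the driver you would use a FileReader on the real file path):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class MixedPropertiesDemo {

    // Load key=value pairs from any Reader into a java.util.Properties object.
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of /home/emre/data/mymodule.properties.
        String contents = "job.output.dir=file:///home/emre/data/mymodule/out\n"
                        + "spark.executor.memory=2g\n";
        Properties props = load(new StringReader(contents));
        // prints file:///home/emre/data/mymodule/out
        System.out.println(props.getProperty("job.output.dir"));
    }
}
```

The Spark-prefixed keys in the same file can still be handed to spark-submit via --properties-file; this only changes how the application reads the non-Spark ones.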