I have been trying to work around a similar problem with my Typesafe config *.conf files seemingly not appearing on the executors. (Though now that I think about it, it's not because the files are absent from the JAR, but because the -Dconf.resource system property I pass to the master obviously doesn't get relayed to the workers.)
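If it helps to compare notes, this is roughly what I'm planning to try on my side, passing the system property to both the driver and the executors (just a sketch; the conf.resource value, class name, and jar name are placeholders from my setup, not yours):

    ./bin/spark-submit \
      --master spark://master-amazonaws.com:7077 \
      --class com.example.MyApp \
      --driver-java-options "-Dconf.resource=prod.conf" \
      --conf "spark.executor.extraJavaOptions=-Dconf.resource=prod.conf" \
      my-app.jar

As far as I can tell, --driver-java-options and spark.executor.extraJavaOptions are the documented ways to get JVM options onto the driver and executor sides respectively, so this should carry the property to both.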
What happens if you do something like this:

nohup ./bin/spark-submit --verbose --jars lib/app.jar \
  --master spark://master-amazonaws.com:7077 \
  --class com.elsevier.spark.SparkSync \
  --conf "spark.executor.extraJavaOptions=-Ds3service.server-side-encryption=AES256" \
  lib/app.jar > out.log &

(I bet this will fix my problem too.)

On Wed Jan 28 2015 at 10:17:09 AM Kohler, Curt E (ELS-STL) <
c.koh...@elsevier.com> wrote:

> So, following up on your suggestion, I'm still having some problems
> getting the configuration changes recognized when my job runs.
>
> I've added jets3t.properties to the root of the application jar file that
> I submit to Spark (via spark-submit).
>
> I've verified that jets3t.properties is at the root of my application
> jar by executing jar tf app.jar.
>
> I submit my job to the cluster with the following command:
>
> nohup ./bin/spark-submit --verbose --jars lib/app.jar --master
> spark://master-amazonaws.com:7077 --class com.elsevier.spark.SparkSync
> lib/app.jar > out.log &
>
> In the mainline of app.jar, I also added the following code:
>
> log.info(System.getProperty("java.class.path"));
> InputStream in =
> SparkSync.class.getClassLoader().getResourceAsStream("jets3t.properties");
> log.info(getStringFromInputStream(in));
>
> And I can see that the jets3t.properties I provided is found, because it
> outputs:
>
> s3service.server-side-encryption=AES256
>
> It's almost as if the hadoop/jets3t piece has already been initialized
> and is ignoring my jets3t.properties.
>
> I can get this all working inside of Eclipse by including the folder
> containing my jets3t.properties. But I can't get things working when
> submitting this to a Spark stand-alone cluster.
>
> Any insights would be appreciated.
> ------------------------------
> *From:* Thomas Demoor <thomas.dem...@amplidata.com>
> *Sent:* Tuesday, January 27, 2015 4:41 AM
> *To:* Kohler, Curt E (ELS-STL)
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark and S3 server side encryption
>
> Spark uses the Hadoop filesystems.
>
> I assume you are trying to use s3n://, which, under the hood, uses the
> third-party jets3t library. It is configured through the
> jets3t.properties file (google "hadoop s3n jets3t"), which you should put
> on Spark's classpath. The setting you are looking for is
> s3service.server-side-encryption.
>
> The latest version of Hadoop (2.6) introduces a new and improved s3a://
> filesystem which has the official SDK from Amazon under the hood.
>
> On Mon, Jan 26, 2015 at 10:01 PM, curtkohler <c.koh...@elsevier.com>
> wrote:
>
>> We are trying to create a Spark job that writes out a file to S3 and
>> leverages S3's server-side encryption for sensitive data. Typically this
>> is accomplished by setting the appropriate header on the put request,
>> but it isn't clear whether this capability is exposed in the
>> Spark/Hadoop APIs. Does anyone have any suggestions?
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-S3-server-side-encryption-tp21377.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
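One more debugging thought for Curt: the getResourceAsStream check quoted above runs in the driver JVM, so it only proves the file is visible on the driver's classpath. Something along these lines (a rough sketch against Spark's Java API; the class name and messages are made up) would run the same lookup inside the executors, which should tell you whether the file is even reaching them:

    import java.io.InputStream;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class Jets3tPropsCheck {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("jets3t-props-check");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Run the same classpath lookup on the executors that the driver does.
        List<String> results = sc.parallelize(Arrays.asList(1, 2, 3, 4), 4)
            .map(i -> {
              InputStream in = Jets3tPropsCheck.class.getClassLoader()
                  .getResourceAsStream("jets3t.properties");
              boolean found = (in != null);
              if (in != null) in.close();
              return found ? "jets3t.properties found on executor"
                           : "jets3t.properties missing on executor";
            })
            .collect();

        results.forEach(System.out::println);
        sc.stop();
      }
    }

If the executors report the file as present, that would point toward what you already suspect: jets3t being configured earlier, or from a different classloader, than the one your application jar ends up on, rather than the file simply being missing.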