Thanks for investigating this. The right place to add these is the core-site.xml templates we have at
https://github.com/amplab/spark-ec2/blob/branch-1.5/templates/root/spark/conf/core-site.xml
and/or
https://github.com/amplab/spark-ec2/blob/branch-1.5/templates/root/ephemeral-hdfs/conf/core-site.xml
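
One thing worth double-checking before the patch goes in: if I remember right, the s3a connector in Hadoop 2.6+ reads fs.s3a.access.key and fs.s3a.secret.key rather than fs.s3a.awsAccessKeyId / fs.s3a.awsSecretAccessKey, so the s3a pair in the mail below may not actually be picked up. A minimal sketch you can paste into spark-shell (where sc is predefined) to try the settings at runtime before editing the template:

  // Minimal sketch, not the template change itself: apply the same
  // settings on the driver's Hadoop configuration to verify them first.
  // Credentials are read from the environment here.
  val hc = sc.hadoopConfiguration
  hc.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem")
  hc.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem")
  hc.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
  hc.set("fs.s3.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
  hc.set("fs.s3.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))
  hc.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
  hc.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))
  // s3a (Hadoop 2.6+) reads these names; the awsAccessKeyId-style
  // names in the mail below may be ignored by s3a:
  hc.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  hc.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))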
Feel free to open a PR against the amplab/spark-ec2 repository for this.

Thanks
Shivaram

On Thu, Nov 5, 2015 at 8:25 AM, Christian <engr...@gmail.com> wrote:
> We ended up reading and writing to S3 a ton in our Spark jobs.
> For this to work, we ended up having to add s3a and s3 key/secret pairs. We
> also had to add fs.hdfs.impl to get these things to work.
>
> I thought maybe I'd share what we did, since it might be worth adding these
> to the Spark conf for out-of-the-box functionality with S3.
>
> We created:
> ec2/deploy.generic/root/spark-ec2/templates/root/spark/conf/core-site.xml
>
> We changed the contents from the original, adding in the following:
>
> <property>
>   <name>fs.file.impl</name>
>   <value>org.apache.hadoop.fs.LocalFileSystem</value>
> </property>
>
> <property>
>   <name>fs.hdfs.impl</name>
>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> </property>
>
> <property>
>   <name>fs.s3.impl</name>
>   <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
> </property>
>
> <property>
>   <name>fs.s3.awsAccessKeyId</name>
>   <value>{{aws_access_key_id}}</value>
> </property>
>
> <property>
>   <name>fs.s3.awsSecretAccessKey</name>
>   <value>{{aws_secret_access_key}}</value>
> </property>
>
> <property>
>   <name>fs.s3n.awsAccessKeyId</name>
>   <value>{{aws_access_key_id}}</value>
> </property>
>
> <property>
>   <name>fs.s3n.awsSecretAccessKey</name>
>   <value>{{aws_secret_access_key}}</value>
> </property>
>
> <property>
>   <name>fs.s3a.awsAccessKeyId</name>
>   <value>{{aws_access_key_id}}</value>
> </property>
>
> <property>
>   <name>fs.s3a.awsSecretAccessKey</name>
>   <value>{{aws_secret_access_key}}</value>
> </property>
>
> This change makes Spark on EC2 work out of the box for us. It took us
> several days to figure this out. It works for 1.4.1 and 1.5.1 on Hadoop
> version 2.
>
> Best Regards,
> Christian
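
One more thing that might save the next person a few days: once the template change is deployed (or the runtime settings above are applied), a quick round trip through S3 from spark-shell confirms the keys and filesystem impls are actually picked up. The bucket name below is a placeholder, not a real one:

  // Quick smoke test: write a small RDD to S3 and read it back.
  // "my-test-bucket" is a placeholder; use a bucket your keys can
  // access, and note the output path must not already exist.
  val path = "s3n://my-test-bucket/spark-ec2-smoke-test"
  sc.parallelize(1 to 100).saveAsTextFile(path)
  println(sc.textFile(path).count())  // expect 100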