On EMR, you can add fs.* params in emrfs-site.xml.

On Tue, Jan 12, 2016 at 7:27 AM, Jonathan Kelly <jonathaka...@gmail.com> wrote:

> Yes, IAM roles are actually required now for EMR. If you use Spark on EMR
> (vs. just EC2), you get S3 configuration for free (it goes by the name
> EMRFS), and it will use your IAM role for communicating with S3. Here is
> the corresponding documentation:
> http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-fs.html
>
> On Mon, Jan 11, 2016 at 11:37 AM Matei Zaharia <matei.zaha...@gmail.com> wrote:
>
>> In production, I'd recommend using IAM roles to avoid having keys
>> altogether. Take a look at
>> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
>>
>> Matei
>>
>> On Jan 11, 2016, at 11:32 AM, Sabarish Sasidharan
>> <sabarish.sasidha...@manthan.com> wrote:
>>
>> If you are on EMR, these can go into your HDFS site config, and they will
>> work with Spark on YARN by default.
>>
>> Regards
>> Sab
>>
>> On 11-Jan-2016 5:16 pm, "Krishna Rao" <krishnanj...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Is there a method for reading from S3 without having to hard-code keys?
>>> The only two ways I've found both require this:
>>>
>>> 1. Set the conf in code, e.g.:
>>> sc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "<aws_key>")
>>> sc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "<aws_secret_key>")
>>>
>>> 2. Embed the keys in the URL, e.g.:
>>> sc.textFile("s3n://<aws_key>:<aws_secret_key>@bucket/test/testdata")
>>>
>>> Both of which I'm reluctant to do within production code!
>>>
>>> Cheers
>>
>> --
>> Best Regards,
>> Ayan Guha
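To illustrate the first suggestion above, a minimal sketch of what such a cluster-side config entry could look like (the fs.s3.* property names are the classic s3/s3n ones from the question; file name and placement may differ by EMR release, and with an IAM role attached these entries are unnecessary altogether):

```xml
<!-- emrfs-site.xml on EMR (or core-site.xml / hdfs-site.xml elsewhere):
     keys live in cluster configuration, never in application code. -->
<configuration>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>REPLACE_WITH_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>REPLACE_WITH_SECRET</value>
  </property>
</configuration>
```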
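For clusters where an IAM role is not an option, another way to keep keys out of source code is to read them from the environment at runtime. A minimal sketch, with a hypothetical helper (`s3_credentials_from_env` is not from the thread, and a plain dict stands in for `sc.hadoopConfiguration()` so the example is self-contained):

```python
import os

def s3_credentials_from_env(conf):
    """Copy AWS credentials from environment variables into a
    Hadoop-style configuration mapping, so no key ever appears
    in the source code itself."""
    conf["fs.s3.awsAccessKeyId"] = os.environ["AWS_ACCESS_KEY_ID"]
    conf["fs.s3.awsSecretAccessKey"] = os.environ["AWS_SECRET_ACCESS_KEY"]
    return conf

# In real Spark code you would call .set(...) on
# sc.hadoopConfiguration() with the same property names.
```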