Re: Help accessing protected S3

2015-07-23 Thread Steve Loughran
On 23 Jul 2015, at 10:47, Greg Anderson gregory.ander...@familysearch.org wrote: So when I go to ~/ephemeral-hdfs/bin/hadoop and check its version, it says Hadoop 2.0.0-cdh4.2.0. If I run pyspark and use the s3a address, things should work, right? What am I missing? And thanks so

Re: Help accessing protected S3

2015-07-23 Thread Steve Loughran
On 23 Jul 2015, at 01:50, Ewan Leith ewan.le...@realitymine.com wrote: I think the standard S3 driver used in Spark from the Hadoop project (S3n) doesn't support IAM role-based authentication. However, S3a should support it. If you're running Hadoop 2.6 via the spark-ec2 scripts (I'm

RE: Help accessing protected S3

2015-07-23 Thread Greg Anderson
...@hortonworks.com] Sent: Thursday, July 23, 2015 11:37 AM To: Ewan Leith Cc: Greg Anderson; user@spark.apache.org Subject: Re: Help accessing protected S3 On 23 Jul 2015, at 01:50, Ewan Leith ewan.le...@realitymine.com wrote: I think the standard S3 driver used in Spark from the Hadoop project

RE: Help accessing protected S3

2015-07-23 Thread Ewan Leith
:// URLs instead of s3n:// http://wiki.apache.org/hadoop/AmazonS3 https://issues.apache.org/jira/browse/HADOOP-10400 Thanks, Ewan -Original Message- From: Greg Anderson [mailto:gregory.ander...@familysearch.org] Sent: 22 July 2015 18:00 To: user@spark.apache.org Subject: Help accessing
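
As a rough illustration of the suggestion above to use s3a:// URLs instead of s3n://, the sketch below rewrites the scheme of a path before it is handed to Spark. The helper name to_s3a and the bucket and key are made up for illustration. Whether s3a:// actually works depends on the Hadoop version on the cluster (the S3A connector is tracked in HADOOP-10400), so treat this as a sketch under those assumptions, not a confirmed fix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_s3a(url):
    """Return url with an s3n:// scheme swapped for s3a://.

    Hypothetical helper; URLs with any other scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "s3n":
        parts = parts._replace(scheme="s3a")
    return urlunsplit(parts)


# Hypothetical bucket and key, for illustration only.
path = to_s3a("s3n://example-bucket/logs/part-00000")
print(path)
# In a pyspark shell the result would then be passed on, e.g. sc.textFile(path)
```

Only the URL scheme changes here; credentials are not touched, since the point of the suggestion is that S3a should pick up the IAM role where S3n does not.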

Help accessing protected S3

2015-07-22 Thread Greg Anderson
I have a protected S3 bucket that requires a certain IAM role to access. When I start my cluster using the spark-ec2 script, everything works just fine until I try to read from that part of S3. Here is the command I am using: ./spark-ec2 -k KEY -i KEY_FILE.pem