Thanks for the input. Yes, I did use the "temporary" access credentials provided by the IAM role (also detailed in the link you provided). The session token needs to be specified, and I was looking for a way to set that in the header, which doesn't seem possible. It looks like a static key/secret is the only option.
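For reference, the "temporary" credentials discussed here come from the EC2 instance metadata endpoint and include a session token alongside the key id and secret. A minimal Python sketch of retrieving them (the role name "my-role" is a placeholder; the fetch itself only works when run on an EC2 instance with an IAM role attached):

```python
import json
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def credentials_url(role_name):
    # Build the instance-metadata URL for a given IAM role name.
    return METADATA_BASE + role_name

def fetch_temporary_credentials(role_name):
    # On an EC2 instance, this endpoint returns a JSON document containing
    # AccessKeyId, SecretAccessKey, Token, and Expiration.
    with urllib.request.urlopen(credentials_url(role_name)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (only meaningful on an EC2 instance with the role attached):
# creds = fetch_temporary_credentials("my-role")
# creds["AccessKeyId"], creds["SecretAccessKey"], creds["Token"]
```

These credentials expire (typically within an hour), which is exactly the limitation described below.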
On Tue, Oct 14, 2014 at 10:32 AM, Gen <gen.tan...@gmail.com> wrote:

> Hi,
>
> If I remember correctly, Spark cannot use IAM role credentials to access
> S3. It first uses the id/key in the environment; if that is null, it uses
> the values in the file core-site.xml. So an IAM role is not useful for
> Spark. The same problem happens if you want to use the distcp command in
> Hadoop.
>
> Do you use curl http://169.254.169.254/latest/meta-data/iam/... to get the
> "temporary" access? If yes, those credentials cannot be used directly by
> Spark. For more information, take a look at
> http://docs.aws.amazon.com/STS/latest/UsingSTS/using-temp-creds.html
>
> sranga wrote:
>
> > Thanks for the pointers.
> > I verified that the access key-id/secret used are valid. However, the
> > secret may contain "/" at times. The issues I am facing are as follows:
> >
> >  - The EC2 instances are set up with an IAM role and don't have a
> >    static key-id/secret
> >  - All of the EC2 instances have access to S3 based on this role (I
> >    used s3ls and s3cp commands to verify this)
> >  - I can get a "temporary" access key-id/secret based on the IAM role,
> >    but they generally expire in an hour
> >  - If Spark is not able to use the IAM role credentials, I may have to
> >    generate a static key-id/secret. This may or may not be possible in
> >    the environment I am in (from a policy perspective)
> >
> > - Ranga
> >
> > On Tue, Oct 14, 2014 at 4:21 AM, Rafal Kwasny <mag@...> wrote:
> >
> >> Hi,
> >> Keep in mind that you're going to have a bad time if your secret key
> >> contains a "/". This is due to an old Hadoop bug:
> >> https://issues.apache.org/jira/browse/HADOOP-3733
> >>
> >> The best way is to regenerate the key so that it does not include a "/".
> >>
> >> /Raf
> >>
> >> Akhil Das wrote:
> >>
> >> Try the following:
> >>
> >> 1. Set the access key and secret key in the sparkContext:
> >>
> >>    sparkContext.set("AWS_ACCESS_KEY_ID", yourAccessKey)
> >>    sparkContext.set("AWS_SECRET_ACCESS_KEY", yourSecretKey)
> >>
> >> 2. Set the access key and secret key in the environment before
> >>    starting your application:
> >>
> >>    export AWS_ACCESS_KEY_ID=<your access>
> >>    export AWS_SECRET_ACCESS_KEY=<your secret>
> >>
> >> 3. Set the access key and secret key inside the Hadoop configuration:
> >>
> >>    val hadoopConf = sparkContext.hadoopConfiguration
> >>    hadoopConf.set("fs.s3.impl",
> >>      "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
> >>    hadoopConf.set("fs.s3.awsAccessKeyId", yourAccessKey)
> >>    hadoopConf.set("fs.s3.awsSecretAccessKey", yourSecretKey)
> >>
> >> 4. You can also try embedding the credentials in the URL:
> >>
> >>    val lines =
> >>      sparkContext.textFile("s3n://yourAccessKey:yourSecretKey@<yourBucket>/path/")
> >>
> >> Thanks
> >> Best Regards
> >>
> >> On Mon, Oct 13, 2014 at 11:33 PM, Ranga <sranga@...> wrote:
> >>
> >>> Hi
> >>>
> >>> I am trying to access files/buckets in S3 and am encountering a
> >>> permissions issue. The buckets are configured to authenticate using
> >>> an IAMRole provider. I have set the key id and secret using
> >>> environment variables (AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID).
> >>> However, I am still unable to access the S3 buckets.
> >>>
> >>> Before setting the access key and secret, the error was:
> >>> "java.lang.IllegalArgumentException: AWS Access Key ID and Secret
> >>> Access Key must be specified as the username or password
> >>> (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId
> >>> or fs.s3n.awsSecretAccessKey properties (respectively)."
> >>>
> >>> After setting the access key and secret, the error is: "The AWS
> >>> Access Key Id you provided does not exist in our records."
> >>>
> >>> The id/secret being set are the right values. This makes me believe
> >>> that something else ("token", etc.) needs to be set as well.
> >>> Any help is appreciated.
> >>>
> >>> - Ranga
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/S3-Bucket-Access-tp16303p16397.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
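As a footnote to the thread above: embedding the key and secret directly in an s3n URL (option 4) breaks when the secret contains "/", per HADOOP-3733. A workaround sometimes suggested is to percent-encode the credentials before embedding them; a sketch of that, with placeholder key, secret, and bucket names. Note that Hadoop versions affected by the bug may mishandle even the encoded form, so regenerating the key, as Rafal suggests, remains the safer fix:

```python
from urllib.parse import quote

def s3n_url(access_key, secret_key, bucket, path):
    # Percent-encode the credentials so characters like "/" in the
    # secret ("/" becomes "%2F") don't break the URL structure.
    return "s3n://{}:{}@{}/{}".format(
        quote(access_key, safe=""), quote(secret_key, safe=""), bucket, path
    )

# A secret containing "/" is encoded instead of splitting the URL:
print(s3n_url("AKIAEXAMPLE", "abc/def+ghi", "my-bucket", "data/part-00000"))
# -> s3n://AKIAEXAMPLE:abc%2Fdef%2Bghi@my-bucket/data/part-00000
```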