There is detailed information available in the official documentation [1]. If you don't have a key pair, you can generate one as described in the AWS documentation [2]. That should be enough to get started.
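For reference, launching a cluster with the ec2 scripts looks roughly like this (a sketch only: the cluster name, key pair name, and identity file path are placeholders, and --copy-aws-credentials is only available on trunk builds, as noted in the thread below):

    # Hypothetical launch: a cluster named "my-cluster" with 2 slaves, using
    # an existing EC2 key pair and its private key file.
    ./spark-ec2 -k my-key-pair -i ~/.ssh/my-key-pair.pem -s 2 launch my-cluster

    # On a trunk build, --copy-aws-credentials also propagates your AWS keys
    # into the cluster's Hadoop configuration:
    ./spark-ec2 -k my-key-pair -i ~/.ssh/my-key-pair.pem -s 2 \
        --copy-aws-credentials launch my-cluster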
[1] http://spark.apache.org/docs/latest/ec2-scripts.html
[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html

On Mon, Oct 13, 2014 at 4:07 PM, Ranga <sra...@gmail.com> wrote:

> Hi Daniil
>
> Could you provide some more details on how the cluster should be
> launched/configured? The EC2 instance that I am dealing with uses the
> concept of IAM roles. I don't have any "keyfile" to specify to the
> spark-ec2 script.
> Thanks for your help.
>
> - Ranga
>
> On Mon, Oct 13, 2014 at 3:04 PM, Daniil Osipov <daniil.osi...@shazam.com> wrote:
>
>> (Copying the user list)
>> You should use the spark_ec2 script to configure the cluster. If you use
>> the trunk version, you can use the new --copy-aws-credentials option to
>> configure the S3 parameters automatically; otherwise, either include them
>> in your SparkConf variable or add them to
>> /root/spark/ephemeral-hdfs/conf/core-site.xml.
>>
>> On Mon, Oct 13, 2014 at 2:56 PM, Ranga <sra...@gmail.com> wrote:
>>
>>> The cluster is deployed on EC2, and I am trying to access the S3 files
>>> from within a spark-shell session.
>>>
>>> On Mon, Oct 13, 2014 at 2:51 PM, Daniil Osipov <daniil.osi...@shazam.com> wrote:
>>>
>>>> So is your cluster running on EC2, or locally? If you're running
>>>> locally, you should still be able to access S3 files; you just need to
>>>> locate the core-site.xml and add the parameters as defined in the error.
>>>>
>>>> On Mon, Oct 13, 2014 at 2:49 PM, Ranga <sra...@gmail.com> wrote:
>>>>
>>>>> Hi Daniil
>>>>>
>>>>> No, I didn't create the Spark cluster using the ec2 scripts. Is that
>>>>> something that I need to do? I just downloaded Spark 1.1.0 and Hadoop
>>>>> 2.4. However, I am trying to access files on S3 from this cluster.
>>>>>
>>>>> - Ranga
>>>>>
>>>>> On Mon, Oct 13, 2014 at 2:36 PM, Daniil Osipov <daniil.osi...@shazam.com> wrote:
>>>>>
>>>>>> Did you add the fs.s3n.aws* configuration parameters in
>>>>>> /root/spark/ephemeral-hdfs/conf/core-site.xml?
>>>>>>
>>>>>> On Mon, Oct 13, 2014 at 11:03 AM, Ranga <sra...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to access files/buckets in S3 and am encountering a
>>>>>>> permissions issue. The buckets are configured to authenticate using
>>>>>>> an IAM role provider. I have set the key ID and secret using the
>>>>>>> environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
>>>>>>> However, I am still unable to access the S3 buckets.
>>>>>>>
>>>>>>> Before setting the access key and secret, the error was:
>>>>>>> "java.lang.IllegalArgumentException: AWS Access Key ID and Secret
>>>>>>> Access Key must be specified as the username or password
>>>>>>> (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId
>>>>>>> or fs.s3n.awsSecretAccessKey properties (respectively)."
>>>>>>>
>>>>>>> After setting the access key and secret, the error is: "The AWS
>>>>>>> Access Key Id you provided does not exist in our records."
>>>>>>>
>>>>>>> The ID/secret being set are the right values, which makes me believe
>>>>>>> that something else ("token", etc.) needs to be set as well.
>>>>>>> Any help is appreciated.
>>>>>>>
>>>>>>> - Ranga
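If editing core-site.xml is inconvenient, the same fs.s3n.* properties can be set from inside spark-shell before reading from S3. A minimal sketch, assuming Spark 1.1 with the s3n filesystem; the key values, bucket, and path are placeholders:

    // Push the S3 credentials into the Hadoop configuration used by this
    // SparkContext (placeholder values, not real keys).
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY_ID")
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_ACCESS_KEY")

    // Illustrative read to verify access.
    sc.textFile("s3n://your-bucket/path/to/data").take(5).foreach(println)

One caveat on the "token" guess above: credentials obtained from an IAM role are temporary and come with a session token, and the s3n filesystem in Hadoop 2.4 only accepts a key ID and secret. That would explain why temporary keys are rejected with "The AWS Access Key Id you provided does not exist in our records"; long-lived keys for an IAM user avoid the problem.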