Re: newbie unable to write to S3 403 forbidden error

2016-02-24 Thread Andy Davidson
Hi Sabarish

We finally got S3 working. I think the real problem was that by default
spark-ec2 installs an old version of Hadoop (1.0.4). Once we passed
--copy-aws-credentials --hadoop-major-version=2 it started working.
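
For reference, the launch invocation was roughly like this (a sketch; the
key pair, instance count, and cluster name below are placeholders):

./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem -s 3 \
    --copy-aws-credentials --hadoop-major-version=2 \
    launch my-spark-cluster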

Kind regards

Andy






Re: newbie unable to write to S3 403 forbidden error

2016-02-14 Thread Sabarish Sasidharan
Make sure you are using an S3 bucket in the same region. Also, I would
access my bucket this way: s3n://bucketname/foldername.

You can test privileges using the S3 command line client.

Also, if you are using instance profiles you don't need to specify access
and secret keys. There is no harm in specifying them, though.
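
For example, with the AWS CLI (s3cmd works similarly; the bucket and prefix
below are just the ones from this thread):

aws s3 ls s3://com.pws.twitter/json/
echo test > /tmp/probe.txt
aws s3 cp /tmp/probe.txt s3://com.pws.twitter/json/probe.txt

If the cp fails with a 403 here too, the problem is in the credentials or
the bucket policy, not in Spark.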

Regards
Sab


Re: newbie unable to write to S3 403 forbidden error

2016-02-13 Thread Patrick Plaatje
Not sure if it's related, but in our Hadoop configuration we're also setting:

sc.hadoopConfiguration().set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
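
Put together with the credential settings from earlier in the thread, the
whole block looks roughly like this (a sketch; property names as in Hadoop
2.x, and accessKey/secretKey are placeholders):

import org.apache.hadoop.conf.Configuration;

Configuration hadoopConf = jsc.hadoopConfiguration();
// map both URI schemes to the native S3 filesystem shipped with Hadoop 2.x
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
hadoopConf.set("fs.s3n.awsAccessKeyId", accessKey);        // placeholder
hadoopConf.set("fs.s3n.awsSecretAccessKey", secretKey);    // placeholder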

Cheers,
-patrick



Re: newbie unable to write to S3 403 forbidden error

2016-02-12 Thread Igor Berman
String dirPath = "s3n://s3-us-west-1.amazonaws.com/com.pws.twitter/json"

not sure, but
can you try to remove s3-us-west-1.amazonaws.com from the path?
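
i.e. something like this instead (a sketch of the suggested change):

// endpoint-qualified form that 403s:
//   s3n://s3-us-west-1.amazonaws.com/com.pws.twitter/json-...
// bucket-only form to try:
String dirPath = "s3n://com.pws.twitter/json" + "-" + time.milliseconds();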



Re: newbie unable to write to S3 403 forbidden error

2016-02-12 Thread Andy Davidson
Hi Igor

So I assume you are able to use S3 from Spark?

Do you use rdd.saveAsTextFile()?

How did you create your cluster? I.e., did you use the spark-1.6.0/spark-ec2
script, EMR, or something else?


I tried several versions of the URL, with no luck :-(

The bucket name is 'com.ps.twitter'. It has a folder 'son'.

We have a developer support contract with Amazon; however, our case has been
unassigned for several days now.

Thanks

Andy

P.S. In general, debugging permission problems is always difficult from the
client side. Secure servers do not want to make it easy for hackers.





newbie unable to write to S3 403 forbidden error

2016-02-11 Thread Andy Davidson
I am using Spark 1.6.0 in a cluster created using the spark-ec2 script. I am
using the standalone cluster manager.

My Java streaming app is not able to write to S3. It appears to be some form
of permission problem.

Any idea what the problem might be?

I tried using the IAM policy simulator to test the policy. Everything seems
okay. Any idea how I can debug this problem?

Thanks in advance

Andy

JavaSparkContext jsc = new JavaSparkContext(conf);

// I did not include the full key in my email
// the keys do not contain '\'
// these are the keys used to create the cluster. They belong to the
// IAM user andy
jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", "AKIAJREX");
jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey",
        "uBh9v1hdUctI23uvq9qR");


private static void saveTweets(JavaDStream<String> jsonTweets, String outputURI) {
    // write each non-empty micro-batch to a time-stamped S3 directory
    jsonTweets.foreachRDD(new VoidFunction2<JavaRDD<String>, Time>() {
        private static final long serialVersionUID = 1L;

        @Override
        public void call(JavaRDD<String> rdd, Time time) throws Exception {
            if (!rdd.isEmpty()) {
                // bucket name is 'com.pws.twitter'; it has a folder 'json'
                String dirPath =
                        "s3n://s3-us-west-1.amazonaws.com/com.pws.twitter/json"
                                + "-" + time.milliseconds();
                rdd.saveAsTextFile(dirPath);
            }
        }
    });
}




Bucket name: com.pws.titter
Bucket policy (I replaced the account id):

{
    "Version": "2012-10-17",
    "Id": "Policy1455148808376",
    "Statement": [
        {
            "Sid": "Stmt1455148797805",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:user/andy"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::com.pws.twitter/*"
        }
    ]
}
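
One thing worth double-checking with a policy scoped only to the object ARN:
bucket-level operations such as s3:ListBucket are not covered by
arn:aws:s3:::com.pws.twitter/*, and the s3n connector does make bucket-level
calls, which would surface as 403s. A variant granting both resource forms
would look like this (a sketch, not verified against this account):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:user/andy"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::com.pws.twitter",
                "arn:aws:s3:::com.pws.twitter/*"
            ]
        }
    ]
}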