Solution:
sc._jsc.hadoopConfiguration().set("fs.s3a.awsAccessKeyId", "...")
sc._jsc.hadoopConfiguration().set("fs.s3a.awsSecretAccessKey", "...")
Got this solution from a Cloudera lady. Thanks, Neerja.
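Putting the two config lines together with the original script, a minimal end-to-end sketch might look like the following. This is an assumption-laden sketch, not code from the thread: the bucket, path, and credentials are placeholders, and it assumes the s3a connector jars are on the classpath. The property names are the ones posted above; note that in Hadoop 2.8+ the documented s3a properties are fs.s3a.access.key and fs.s3a.secret.key instead.

```python
# Minimal sketch (assumptions: CDH with the s3a connector available;
# bucket, path, and credentials below are placeholders, not from the thread).
from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName('s3-read-test')
sc = SparkContext(conf=conf)

# Property names as posted in the thread; Hadoop 2.8+ uses
# fs.s3a.access.key / fs.s3a.secret.key instead.
hconf = sc._jsc.hadoopConfiguration()
hconf.set("fs.s3a.awsAccessKeyId", "YOUR_ACCESS_KEY")
hconf.set("fs.s3a.awsSecretAccessKey", "YOUR_SECRET_KEY")

rdd = sc.textFile("s3a://your-bucket/path/to/file.json.gz")
print(rdd.count())
```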
Subject: Re: spark 1.6.0 read s3 files error.
Anyone, please? I believe many of us are using Spark 1.6 or higher with
S3...
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-6-0-read-s3-files-error-tp27417p27451.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Tried the following; it still failed the same way. It ran in YARN, on CDH 5.8.0.
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName('s3 ---')
sc = SparkContext(conf=conf)
sc._jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", "...")
BTW, I also tried YARN. Same error.
When I ran the script, I used the real S3 credentials, which are omitted
in this post. Sorry about that.
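An aside, not from the thread: Spark also copies any SparkConf property prefixed with spark.hadoop. into the job's Hadoop configuration, so the keys can be set before the SparkContext is created rather than by mutating sc._jsc.hadoopConfiguration() afterwards. A sketch using the thread's s3n property names, with the "..." placeholders standing in for real credentials:

```python
# Sketch (assumption: SparkConf entries prefixed with "spark.hadoop." are
# copied into hadoopConfiguration() with the prefix stripped).
from pyspark import SparkContext, SparkConf

conf = (SparkConf()
        .setAppName('s3 ---')
        .set("spark.hadoop.fs.s3n.awsAccessKeyId", "...")
        .set("spark.hadoop.fs.s3n.awsSecretAccessKey", "..."))
sc = SparkContext(conf=conf)
```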
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-6-0-read-s3-files-error-tp27417p27425.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi Freedafeng,
Can you tell us a little more? E.g., can you paste your code and the error message?
Andy
From: freedafeng <freedaf...@yahoo.com>
Date: Thursday, July 28, 2016 at 9:21 AM
To: "user @spark" <user@spark.apache.org>
Subject: Re: spark 1.6.0 read s3 files error.
The question is: what is the cause of the problem, and how do we fix it? Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-6-0-read-s3-files-error-tp27417p27424.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
To: "user @spark" <user@spark.apache.org>
Subject: spark 1.6.0 read s3 files error.
cdh 5.7.1. pyspark.
code: ===
from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName('s3 ---')
sc = SparkContext(conf=conf)

myRdd = sc.textFile("s3n:///y=2016/m=5/d=26/h=20/2016.5.26.21.9.52.6d53180a-28b9-4e65-b749-b4a2694b9199.json.gz")
count =