I have tried both cases (s3 and s3n, with all possible parameters set), and trust me, the same code works with 1.3.1 but not with 1.3.0, 1.4.0, or 1.5.0.
I even tested this in a plain project, using Maven to pull in all referenced libraries, and it still gives me the error. I think anyone can easily reproduce my issue locally (the code doesn't need to run on EC2; I run it directly from my local Windows PC).

Regards,

Shuai

From: Aaron Davidson [mailto:ilike...@gmail.com]
Sent: Wednesday, June 10, 2015 12:28 PM
To: Shuai Zheng
Subject: Re: [SPARK-6330] 1.4.0/1.5.0 Bug to access S3 -- AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively)

That exception is a bit weird as it refers to fs.s3 instead of fs.s3n. Maybe you are accidentally using s3://? Otherwise, you might also try specifying that property.

On Jun 9, 2015 12:45 PM, "Shuai Zheng" <szheng.c...@gmail.com> wrote:

Hi All,

I have some code to access s3 from Spark. The code is as simple as:

    JavaSparkContext ctx = new JavaSparkContext(sparkConf);
    Configuration hadoopConf = ctx.hadoopConfiguration();
    hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
    hadoopConf.set("fs.s3n.awsAccessKeyId", "-----------------------");
    hadoopConf.set("fs.s3n.awsSecretAccessKey", "------------------------------");
    SQLContext sql = new SQLContext(ctx);
    DataFrame grid_lookup = sql.parquetFile("s3n://-------------------");
    grid_lookup.count();
    ctx.stop();

The code works for 1.3.1. For 1.4.0 and the latest 1.5.0, it always gives me the exception below:

    Exception in thread "main" java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).

I don't know why. I remember this was a known issue in 1.3.0 (https://issues.apache.org/jira/browse/SPARK-6330) that was solved in 1.3.1. But now it is broken again in a newer version?
I remember that when I switched to 1.4.0, it worked for a while (I was working with the master branch of the latest source code). I just pulled the latest code, and now I get this error again. Does anyone have an idea?

Regards,

Shuai
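
Aaron's suggestion of also specifying the fs.s3 properties alongside the fs.s3n ones could be sketched roughly as below. This is only an illustration and is not confirmed to resolve the error in 1.4.0/1.5.0. The class S3CredentialProps and the method credentialProperties are hypothetical helpers, not part of Spark or Hadoop; only the four property names come from the thread. The sketch uses plain Java collections so it runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Spark or Hadoop.
class S3CredentialProps {

    // Builds the credential properties for both the "s3" and "s3n" schemes,
    // so the same keys are available whichever scheme ends up being resolved.
    static Map<String, String> credentialProperties(String accessKey, String secretKey) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String scheme : new String[] {"s3", "s3n"}) {
            props.put("fs." + scheme + ".awsAccessKeyId", accessKey);
            props.put("fs." + scheme + ".awsSecretAccessKey", secretKey);
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; real keys are deliberately not shown.
        Map<String, String> props = credentialProperties("<access-key>", "<secret-key>");
        // Print only the property names, never the secret values.
        for (String name : props.keySet()) {
            System.out.println(name);
        }
    }
}
```

In the job from the original message, each entry would then be applied to the existing configuration with hadoopConf.set(name, value) before the first read from S3.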