Hi All,
I have some code to access S3 from Spark. The code is as simple as:

    JavaSparkContext ctx = new JavaSparkContext(sparkConf);
    Configuration hadoopConf = ctx.hadoopConfiguration();
    // aws.secretKey=------------------------------
    // aws.accessKey=-----------------------
    hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
    hadoopConf.set("fs.s3n.awsAccessKeyId", "-----------------------");
    hadoopConf.set("fs.s3n.awsSecretAccessKey", "------------------------------");
    SQLContext sql = new SQLContext(ctx);
    DataFrame grid_lookup = sql.parquetFile("s3n://-------------------");
    grid_lookup.count();
    ctx.stop();

The code works on 1.3.1. On 1.4.0 and the latest 1.5.0, it always gives me the exception below:

    Exception in thread "main" java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).

I don't know why. I remember this was a known issue in 1.3.0 (https://issues.apache.org/jira/browse/SPARK-6330) and was fixed in 1.3.1, but now it is broken again in a newer version. I also remember that when I first switched to 1.4.0 (working from the master branch at the time), it worked for a while; after I pulled the latest code, I got this error again.

Does anyone have an idea?

Regards,
Shuai
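For what it's worth, here is an untested workaround sketch based on the exception text itself: since the message names the fs.s3.* property variants, setting those in addition to the fs.s3n.* keys, or passing the values through SparkConf with the spark.hadoop.* prefix (which Spark copies into every Hadoop Configuration it creates), may get past the check. The accessKey/secretKey variables are placeholders for credentials supplied from outside the code, not real values.

```java
// Untested sketch: set both the s3n and s3 credential properties, since the
// IllegalArgumentException complains about the fs.s3.* variants.
// accessKey/secretKey are placeholder variables, not real credentials.
Configuration hadoopConf = ctx.hadoopConfiguration();
hadoopConf.set("fs.s3n.awsAccessKeyId", accessKey);
hadoopConf.set("fs.s3n.awsSecretAccessKey", secretKey);
hadoopConf.set("fs.s3.awsAccessKeyId", accessKey);
hadoopConf.set("fs.s3.awsSecretAccessKey", secretKey);

// Alternatively, pass them in via SparkConf before building the context;
// Spark propagates spark.hadoop.* entries into the Hadoop Configuration:
SparkConf sparkConf = new SparkConf()
    .set("spark.hadoop.fs.s3n.awsAccessKeyId", accessKey)
    .set("spark.hadoop.fs.s3n.awsSecretAccessKey", secretKey);
```

Hard-coding keys in source (even redacted) is risky either way; keeping them out of the code entirely and injecting them through configuration is the safer pattern.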