Thanks a lot, Ted!!

On Tue, Mar 3, 2015 at 9:53 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> If you can use hadoop 2.6.0 binary, you can use s3a
>
> s3a is being polished in the upcoming 2.7.0 release:
> https://issues.apache.org/jira/browse/HADOOP-11571
>
> Cheers
>
> On Tue, Mar 3, 2015 at 9:44 AM, Ankur Srivastava <
> ankur.srivast...@gmail.com> wrote:
>
>> Hi,
>>
>> We recently upgraded to Spark 1.2.1 - Hadoop 2.4 binary. We are not
>> having any other dependency on hadoop jars, except for reading our source
>> files from S3.
>>
>> Since we have upgraded to the latest version our reads from S3 have
>> considerably slowed down. For some jobs we see the read from S3 is stalled
>> for a long time and then it starts.
>>
>> Is there a known issue with S3 or do we need to upgrade any settings? The
>> only settings that we are using are:
>>
>> sc.hadoopConfiguration().set("fs.s3n.impl",
>>     "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
>> sc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", someKey);
>> sc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", someSecret);
>>
>> Thanks for help!!
>>
>> - Ankur
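
For anyone who finds this thread later: a rough sketch of what the s3a equivalent of the three s3n settings quoted above might look like. The property names (fs.s3a.impl, fs.s3a.access.key, fs.s3a.secret.key) are the ones the Hadoop 2.6+ s3a connector reads. The S3aSettings class and its two helper methods are hypothetical names made up for illustration, and the sketch assumes the hadoop-aws jar and its AWS SDK dependency are on the classpath and that input paths are switched from s3n:// to s3a://. It has not been tried against the setup described in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Spark or Hadoop: collects the s3a
// settings in one place so they can be applied to any configuration object.
final class S3aSettings {

    private S3aSettings() {}

    // Builds the s3a counterparts of the s3n settings from the thread.
    static Map<String, String> s3aProperties(String accessKey, String secretKey) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
        props.put("fs.s3a.access.key", accessKey);
        props.put("fs.s3a.secret.key", secretKey);
        return props;
    }

    // Applies every property through a (key, value) setter, for example
    // the two-argument set method of a Hadoop Configuration.
    static void apply(Map<String, String> props, BiConsumer<String, String> setter) {
        props.forEach(setter);
    }
}
```

With a JavaSparkContext named sc, as in the original mail, this would presumably be wired up as:

    S3aSettings.apply(S3aSettings.s3aProperties(someKey, someSecret),
        sc.hadoopConfiguration()::set);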