If you can use the Hadoop 2.6.0 binary, you can use s3a.
s3a is being polished in the upcoming 2.7.0 release:
https://issues.apache.org/jira/browse/HADOOP-11571
Cheers
Thanks a lot Ted!!
On Tue, Mar 3, 2015 at 9:53 AM, Ted Yu yuzhih...@gmail.com wrote:
If you can use the Hadoop 2.6.0 binary, you can use s3a.
s3a is being polished in the upcoming 2.7.0 release:
https://issues.apache.org/jira/browse/HADOOP-11571
Cheers
On Tue, Mar 3, 2015 at 9:44 AM, Ankur Srivastava ankur.srivast...@gmail.com wrote:
Hi,
We recently upgraded to the Spark 1.2.1 binary built for Hadoop 2.4. We have
no other dependency on Hadoop jars, except for reading our source files
from S3.
Since upgrading to the latest version, our reads from S3 have
considerably slowed down. For some jobs we see the read from S3 is