Hmm, I just tried changing s3n to s3a:

py4j.protocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3a.S3AFileSystem not found

Nick

On Thu, May 7, 2015 at 12:29 PM Peter Rudenko <petro.rude...@gmail.com>
wrote:

> Hi Nick, I had the same issue.
> By default it should work with s3a protocol:
>
> sc.textFile('s3a://bucket/file_*').count()
>
>
> If you want to use the s3n protocol you need to add hadoop-aws.jar to Spark's
> classpath. Which Hadoop vendor (Hortonworks, Cloudera, MapR) do you use?
>
> Thanks,
> Peter Rudenko
>
> On 2015-05-07 19:25, Nicholas Chammas wrote:
>
> Details are here: https://issues.apache.org/jira/browse/SPARK-7442
>
> It looks like something specific to building against Hadoop 2.6?
>
> Nick
>
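
For reference, one common way to resolve a `ClassNotFoundException` for `org.apache.hadoop.fs.s3a.S3AFileSystem` is to put the hadoop-aws module (and its transitive AWS SDK dependency) on Spark's classpath at launch time via `--packages`. A minimal sketch; the 2.6.0 version shown here is an assumption and must match the Hadoop version your Spark build targets:

```shell
# Hypothetical fix: pull hadoop-aws onto the driver/executor classpath.
# The version (2.6.0) is an assumption -- align it with your Hadoop build.
pyspark --packages org.apache.hadoop:hadoop-aws:2.6.0

# With the jar on the classpath, s3a:// reads should then resolve
# S3AFileSystem, e.g.:
#   sc.textFile('s3a://bucket/file_*').count()
```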
