Hmm, I just tried changing s3n to s3a and got:

py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
Nick

On Thu, May 7, 2015 at 12:29 PM Peter Rudenko <petro.rude...@gmail.com> wrote:

> Hi Nick, I had the same issue.
> By default it should work with the s3a protocol:
>
>     sc.textFile('s3a://bucket/file_*').count()
>
> If you want to use the s3n protocol, you need to add hadoop-aws.jar to
> Spark's classpath. Which Hadoop vendor (Hortonworks, Cloudera, MapR) do
> you use?
>
> Thanks,
> Peter Rudenko
>
> On 2015-05-07 19:25, Nicholas Chammas wrote:
> > Details are here: https://issues.apache.org/jira/browse/SPARK-7442
> >
> > It looks like something specific to building against Hadoop 2.6?
> >
> > Nick
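For reference, that ClassNotFoundException usually means the hadoop-aws module (which contains org.apache.hadoop.fs.s3a.S3AFileSystem) is not on Spark's classpath. A minimal sketch of two ways to pull it in at submit time; the version number, jar paths, and script name below are assumptions and must be matched to your own Hadoop build:

```shell
# Option 1: let Spark resolve hadoop-aws (and its AWS SDK dependency)
# from Maven at submit time. The 2.6.0 version is an assumption --
# it should match the Hadoop version Spark was built against.
spark-submit --packages org.apache.hadoop:hadoop-aws:2.6.0 my_job.py

# Option 2: point the driver classpath at jars you already have locally.
# The paths and the aws-java-sdk version are placeholders.
spark-submit \
  --driver-class-path /path/to/hadoop-aws-2.6.0.jar:/path/to/aws-java-sdk-1.7.4.jar \
  my_job.py
```

Either way, once the jar is on both the driver and executor classpaths, `sc.textFile('s3a://bucket/...')` should resolve the s3a scheme.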