Hi all,
I'm writing a Spark application to load S3 data into HDFS.
The HDFS version is 2.3.0, so I have to compile Spark against Hadoop 2.3.0.
After I execute
val allfiles = sc.textFile("s3n://abc/*.txt")
allfiles.saveAsTextFile("hdfs://x.x.x.x:9000/dataset")  // saveAsTextFile returns Unit
Spark throws an exception:
I ran into the same issue. The problem seems to be with the jets3t library
that Spark uses in project/SparkBuild.scala.
Change this:
"net.java.dev.jets3t" % "jets3t" % "0.7.1"
to
"net.java.dev.jets3t" % "jets3t" % "0.9.0"
0.7.1 is not the right version of jets3t for Hadoop 2.3.0.
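For reference, the corrected dependency would be declared roughly like this in sbt (a minimal sketch; in the real project/SparkBuild.scala this line sits inside a larger dependency list, and the surrounding build definition varies across Spark versions):

```scala
// Sketch of the fix: pin jets3t to 0.9.0, the version that Hadoop 2.3.0's
// s3n:// filesystem code expects (0.7.1 predates it and causes the failure above).
libraryDependencies += "net.java.dev.jets3t" % "jets3t" % "0.9.0"
```

After changing the version, a clean rebuild of Spark is needed so the old 0.7.1 jar does not remain on the classpath.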
Yes, I fixed it the same way, but didn't get a chance to get back here.
I also made a PR: https://github.com/apache/spark/pull/468
Best,
--
Nan Zhu
On Monday, April 21, 2014 at 8:19 PM, Parviz Deyhim wrote: