Thanks Akhil, it solved the problem.

best
/Shahab

On Fri, Jun 12, 2015 at 8:50 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Looks like your Spark is not able to pick up the HADOOP_CONF. To fix this,
> you can add jets3t-0.9.0.jar to the classpath, e.g.
> sc.addJar("/path/to/jets3t-0.9.0.jar").
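>
> Something along these lines (adjust the jar path for your setup; the bucket
> and file names are taken from your snippet):
>
>   // make the jets3t classes available to the SparkContext
>   sc.addJar("/path/to/jets3t-0.9.0.jar")
>   // retry the read once the jar is on the classpath
>   val csv = sc.textFile("s3n://mybucket/info.csv")
>
> If you launch through spark-submit, passing --jars /path/to/jets3t-0.9.0.jar
> (plus --driver-class-path if the error shows up on the driver) is another
> way to get the jar picked up.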
>
> Thanks
> Best Regards
>
> On Thu, Jun 11, 2015 at 6:44 PM, shahab <shahab.mok...@gmail.com> wrote:
>
>> Hi,
>>
>> I tried to read a CSV file from Amazon S3, but I get the following
>> exception, which I have no clue how to solve. I tried both Spark 1.3.1
>> and 1.2.1, with no success. Any idea how to solve this is appreciated.
>>
>>
>> best,
>> /Shahab
>>
>> the code:
>>
>> val hadoopConf = sc.hadoopConfiguration
>>
>> // use the native S3 filesystem and pass the AWS credentials
>> hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
>>
>> hadoopConf.set("fs.s3.awsAccessKeyId", aws_access_key_id)
>>
>> hadoopConf.set("fs.s3.awsSecretAccessKey", aws_secret_access_key)
>>
>> val csv = sc.textFile("s3n://mybucket/info.csv")  // original file
>>
>> val data = csv.map(line => line.split(",").map(elem => elem.trim))  // lines in rows
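>>
>> (The action that actually kicks off the S3 read, as the trace below shows,
>> is a count on this RDD, roughly:)
>>
>>  val n = data.count()  // forces the read; this is where the exception is thrown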
>>
>>
>> Here is the exception I faced:
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/jets3t/service/ServiceException
>> at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:280)
>> at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:270)
>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>> at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
>> at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>> at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
>> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>> at scala.Option.getOrElse(Option.scala:120)
>> at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>> at scala.Option.getOrElse(Option.scala:120)
>> at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>> at scala.Option.getOrElse(Option.scala:120)
>> at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1512)
>> at org.apache.spark.rdd.RDD.count(RDD.scala:1006)
>>
>
>
