Thank you. I forgot the classOf[*] arguments.
On Tue, Mar 11, 2014 at 10:46 AM, Shixiong Zhu <zsxw...@gmail.com> wrote:

> Hi Jaonary,
>
> You can use "sc.sequenceFile" to load your file. E.g.,
>
> scala> import org.apache.hadoop.io._
> import org.apache.hadoop.io._
>
> scala> val rdd = sc.sequenceFile("path_to_file", classOf[Text],
> classOf[BytesWritable])
> rdd: org.apache.spark.rdd.RDD[(org.apache.hadoop.io.Text,
> org.apache.hadoop.io.BytesWritable)] = HadoopRDD[0] at sequenceFile at
> <console>:15
>
> Best Regards,
> Shixiong Zhu
>
>
> 2014-03-11 16:54 GMT+08:00 Jaonary Rabarisoa <jaon...@gmail.com>:
>
>> Hi all,
>>
>> I'm trying to read a SequenceFile that represents a set of JPEG images
>> generated with this tool:
>> http://stuartsierra.com/2008/04/24/a-million-little-files . According to
>> the documentation: "Each key is the name of a file (a Hadoop "Text"),
>> the value is the binary contents of the file (a BytesWritable)"
>>
>> How do I load the generated file inside Spark?
>>
>> Cheers,
>>
>> Jaonary
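For anyone following along, a minimal sketch of the full pattern is below. It assumes a SparkContext named `sc` (as in the spark-shell session quoted above) and uses "path_to_file" as a placeholder path. One caveat worth noting: Hadoop's record reader reuses the same Writable instances across records, so if you plan to cache or collect the RDD you should copy the key and value contents out first.

```scala
import org.apache.hadoop.io.{BytesWritable, Text}

// Load the SequenceFile of images: keys are file names (Text),
// values are the raw file bytes (BytesWritable).
val rdd = sc.sequenceFile("path_to_file", classOf[Text], classOf[BytesWritable])

// Hadoop reuses Writable objects between records, so materialize a fresh
// String and byte array per record. getLength matters: the backing array
// returned by getBytes may be larger than the actual payload.
val images: org.apache.spark.rdd.RDD[(String, Array[Byte])] = rdd.map {
  case (name, bytes) =>
    (name.toString, java.util.Arrays.copyOf(bytes.getBytes, bytes.getLength))
}
```

Each element of `images` is then a (file name, JPEG bytes) pair that is safe to cache, collect, or decode.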