https://docs.databricks.com/spark/latest/data-sources/read-lzo.html

On Wed, Sep 27, 2017 at 6:36 AM 孫澤恩 <gn00710...@gmail.com> wrote:
> Hi All,
>
> Currently, I follow this blog
> http://blog.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
> so that I can use `hdfs dfs -text` to read the LZO file.
> But I want to know how to use Spark to read an LZO file.
> I put hadoop-lzo.jar into spark/jars and followed
> https://github.com/awslabs/emr-bootstrap-actions/blob/master/spark/examples/reading-lzo-files.md
>
> Here is my script:
>
> sc.newAPIHadoopFile("hdfs://<my_path_to_file>",
>   classOf[com.hadoop.mapreduce.LzoTextInputFormat],
>   classOf[org.apache.hadoop.io.LongWritable],
>   classOf[org.apache.hadoop.io.Text])
> val lzoRDD = files.map(_._2.toString)
>
> The result of it is null.
>
> Does anyone have experience with this?
>
> Sean Sun
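For reference, one likely problem in the quoted script is that the result of sc.newAPIHadoopFile is never assigned, yet the next line maps over an undefined `files`. A minimal sketch of the corrected spark-shell session, assuming hadoop-lzo is on the driver/executor classpath and the HDFS path placeholder is filled in:

```scala
// Sketch only: <my_path_to_file> is a placeholder from the original mail,
// and hadoop-lzo.jar must already be on the Spark classpath.
val files = sc.newAPIHadoopFile(
  "hdfs://<my_path_to_file>",                        // path to the .lzo file
  classOf[com.hadoop.mapreduce.LzoTextInputFormat],  // splittable LZO input format
  classOf[org.apache.hadoop.io.LongWritable],        // key: byte offset
  classOf[org.apache.hadoop.io.Text])                // value: line of text

val lzoRDD = files.map(_._2.toString)  // keep only the text values
lzoRDD.take(5).foreach(println)        // sanity-check a few lines
```

Binding the returned RDD to `files` before mapping over it is the key fix; without it, the `files.map(...)` line refers to nothing the shell knows about.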