Re: how to read lz4 compressed data using fileStream of spark streaming?

Akhil Das Thu, 14 May 2015 00:42:06 -0700

I think that's because you are using TextInputFormat. Try
LzoTextInputFormat instead, like this:


val list_join_action_stream = ssc.fileStream[LongWritable, Text,
  com.hadoop.mapreduce.LzoTextInputFormat](
    gc.input_dir, (t: Path) => true, false).map(_._2.toString)
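
For context, here is a rough, self-contained sketch of how that call might sit in a full application, including the output action mentioned in the quoted message below. The object name, the 10-second batch interval, and taking the input directory from args(0) are assumptions for illustration, not part of the original code. It also assumes the hadoop-lzo jar is on the classpath and that the files are in a format LzoTextInputFormat can read, which this thread does not confirm.

```scala
import com.hadoop.mapreduce.LzoTextInputFormat // from the hadoop-lzo library (assumed to be on the classpath)
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical application wrapper; the name is for illustration only.
object LzoFileStreamSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: the monitored directory is passed as the first argument
    // (the original code reads it from gc.input_dir).
    val inputDir = args(0)

    val conf = new SparkConf().setAppName("LzoFileStreamSketch")
    // Assumption: a 10-second batch interval; pick whatever suits the job.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Same call as above: accept every path, and newFilesOnly = false so
    // files already present in the directory are processed as well.
    val lines = ssc
      .fileStream[LongWritable, Text, LzoTextInputFormat](
        inputDir, (p: Path) => true, false)
      .map(_._2.toString)

    // An output action is required, otherwise nothing is executed.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the count printed for each batch stays at 0 after new files arrive, the input format is probably still not matching the files.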

Thanks
Best Regards

On Thu, May 14, 2015 at 1:04 PM, lisendong <[email protected]> wrote:

> I do have an action on the DStream.
> When I put a plain text file into HDFS, it runs normally, but when I
> put an lz4 file, it does nothing.
>
> On May 14, 2015, at 3:32 PM, Akhil Das <[email protected]> wrote:
>
> What do you mean by not detected? Maybe you forgot to trigger an action
> on the stream to get it executed, like:
>
> val list_join_action_stream = ssc.fileStream[LongWritable, Text,
> TextInputFormat](gc.input_dir, (t: Path) => true, false).map(_._2.toString)
>
> list_join_action_stream.count().print()
>
>
>
>
> Thanks
> Best Regards
>
> On Wed, May 13, 2015 at 7:18 PM, hotdog <[email protected]> wrote:
>
>> In Spark Streaming, I want to use fileStream to monitor a directory, but
>> the files in that directory are compressed with lz4, and new lz4 files are
>> not detected by the following code. How can I detect these new files?
>>
>>     val list_join_action_stream = ssc.fileStream[LongWritable, Text,
>>       TextInputFormat](gc.input_dir, (t: Path) => true,
>>       false).map(_._2.toString)
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-read-lz4-compressed-data-using-fileStream-of-spark-streaming-tp22868.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>
>
