Re: How do I Process Streams that span multiple lines?

2015-08-04 Thread Akhil Das
If you are using Kafka, you can push an entire file as a single message.
Your DStream will then receive one message containing the whole file's
contents, which can of course span multiple lines.
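A minimal sketch of the consumer side under that approach, assuming each Kafka message carries an entire file; the function name and the commented DStream wiring are illustrative, not from the thread:

```python
def process_file_message(payload):
    """Treat one Kafka message as a whole multi-line file.

    With Spark Streaming + Kafka, each DStream record is a (key, value)
    pair whose value is the full file contents, so the multi-line
    handling happens here rather than line by line.
    """
    lines = payload.splitlines()
    word_count = sum(len(line.split()) for line in lines)
    return {"lines": len(lines), "words": word_count}

# Hypothetical wiring inside a Spark Streaming job (not run here):
#   stream = KafkaUtils.createStream(ssc, zkQuorum, "group", {"files": 1})
#   stats = stream.map(lambda kv: process_file_message(kv[1]))

if __name__ == "__main__":
    sample = "first line\nsecond line with words\nthird"
    print(process_file_message(sample))  # → {'lines': 3, 'words': 7}
```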

Thanks
Best Regards

On Mon, Aug 3, 2015 at 8:27 PM, Spark Enthusiast sparkenthusi...@yahoo.in
wrote:

 All examples of Spark Streaming programming that I see assume streams of
 lines that are then tokenised and acted upon (like the WordCount example).

 How do I process streams that span multiple lines? Are there examples that
 I can use?



How do I Process Streams that span multiple lines?

2015-08-03 Thread Spark Enthusiast
All examples of Spark Streaming programming that I see assume streams of lines
that are then tokenised and acted upon (like the WordCount example).
How do I process streams that span multiple lines? Are there examples that I
can use?

Re: How do I Process Streams that span multiple lines?

2015-08-03 Thread Michal Čizmazia
Are you looking for RDD.wholeTextFiles?

On 3 August 2015 at 10:57, Spark Enthusiast sparkenthusi...@yahoo.in
wrote:

 All examples of Spark Streaming programming that I see assume streams of
 lines that are then tokenised and acted upon (like the WordCount example).

 How do I process streams that span multiple lines? Are there examples that
 I can use?



Re: How do I Process Streams that span multiple lines?

2015-08-03 Thread Michal Čizmazia
Sorry.

SparkContext.wholeTextFiles

Not sure about streams.
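For the batch case, SparkContext.wholeTextFiles returns one (path, content) pair per file, so a single record can span many lines. A small sketch of per-file processing; the helper name and directory paths are illustrative, and the Spark call itself is only shown in a comment:

```python
import os
import tempfile

def summarize_file(path_content):
    # wholeTextFiles yields (file path, entire file contents) pairs;
    # the whole multi-line file arrives as one record.
    path, content = path_content
    return (os.path.basename(path), len(content.splitlines()))

# With Spark (assumption, not executed here):
#   pairs = sc.wholeTextFiles("hdfs:///input/dir")
#   summaries = pairs.map(summarize_file)

if __name__ == "__main__":
    # Simulate the (path, content) pairs locally.
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "a.txt")
        with open(p, "w") as f:
            f.write("one\ntwo\nthree\n")
        with open(p) as f:
            print(summarize_file((p, f.read())))  # → ('a.txt', 3)
```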

On 3 August 2015 at 14:50, Michal Čizmazia mici...@gmail.com wrote:

 Are you looking for RDD.wholeTextFiles?

 On 3 August 2015 at 10:57, Spark Enthusiast sparkenthusi...@yahoo.in
 wrote:

 All examples of Spark Streaming programming that I see assume streams of
 lines that are then tokenised and acted upon (like the WordCount example).

 How do I process streams that span multiple lines? Are there examples
 that I can use?