The real question is why you are looking to consume the file as a stream:
1. It is too big to load as an RDD.
2. You need to operate on it in a sequential manner.
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi
On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta
@Soumya Simanta
Right now it's just a proof of concept; later I will have a real stream. These are
EEG recordings of the brain, and later it can be used for real-time analysis of EEG streams.
@Mayur
Yes, the size is huge, so it's better to process it in a distributed manner, and
as I said above, I want to read it as a stream.
@Laeeq - please see this example.
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala#L47-L49
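The pattern in that example boils down to watching a directory with `textFileStream`, which treats each new file dropped into the directory as one micro-batch. A minimal sketch of that pattern (the directory path, batch interval, and app name here are assumed values, not from the thread):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FileAsStream")
    // 5-second batch interval is an assumed value; tune for your data rate.
    val ssc = new StreamingContext(conf, Seconds(5))

    // textFileStream watches a directory and turns each NEW file moved
    // into it into one batch of the stream. "/data/eeg" is hypothetical.
    val lines = ssc.textFileStream("/data/eeg")

    lines.foreachRDD { rdd =>
      // count() is a stand-in for the real per-batch analysis.
      println(s"records in this batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that `textFileStream` only picks up files that appear in the directory after the stream starts, so an existing file has to be moved or copied in to be seen.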
On Sat, May 17, 2014 at 2:06 PM, Laeeq Ahmed laeeqsp...@yahoo.com wrote:
A file is just a stream with a fixed length. Usually streams don't end, but in this
case it would.
On the other hand, if you read your file as a stream, you may not be able to use the
entire data in the file for your analysis. Spark (given enough memory) can
process large amounts of data quickly.
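For comparison, the non-streaming route alluded to above is to load the complete file as a plain RDD, so the whole dataset is available to the analysis at once. A minimal sketch, assuming a hypothetical local path:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WholeFileRDD {
  def main(args: Array[String]): Unit = {
    // local[*] is an assumed master setting for a single-machine run.
    val sc = new SparkContext(
      new SparkConf().setAppName("WholeFile").setMaster("local[*]"))

    // Load the entire file at once; every record is visible to the job,
    // unlike the streaming case where only one batch is visible at a time.
    val data = sc.textFile("/data/eeg/recording.txt") // hypothetical path

    println(s"total records: ${data.count()}")
    sc.stop()
  }
}
```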
Hi,
I have data in a file. Can I read it as a stream in Spark? I know it seems odd to
read a file as a stream, but it has practical real-life applications if I can
read it that way. Are there any other tools that can feed this file as a stream
to Spark, or do I have to make batches manually, which is
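One way to "make batches manually", as the question puts it, is Spark Streaming's `queueStream`, which replays a queue of pre-built RDDs, one per batch interval. A sketch of that approach (the file path and chunk size of 1000 lines are assumed values):

```scala
import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ManualBatches {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ManualBatches")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Pre-split the file into fixed-size chunks and queue each chunk
    // as its own RDD; queueStream then emits one RDD per batch interval.
    val lines = scala.io.Source
      .fromFile("/data/eeg/recording.txt") // hypothetical path
      .getLines()
    val queue = mutable.Queue[RDD[String]]()
    lines.grouped(1000).foreach { chunk =>
      queue += ssc.sparkContext.parallelize(chunk)
    }

    val stream = ssc.queueStream(queue)
    stream.foreachRDD { rdd =>
      // Placeholder for the real per-batch analysis.
      println(s"batch size: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This keeps the sequential order of the file while still exercising the same DStream code that a real source would later feed.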