Re: Historical Data as Stream

2014-05-17 Thread Soumya Simanta
@Laeeq - please see this example.
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala#L47-L49
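
For reference, a minimal sketch of the pattern those lines use, assuming the Spark 1.x
Scala streaming API; the watched directory and batch interval below are placeholders,
not values from this thread:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileAsStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FileAsStream")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Any file moved into the watched directory becomes one batch of the DStream.
    val lines = ssc.textFileStream("hdfs:///path/to/watched/dir")
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}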



On Sat, May 17, 2014 at 2:06 PM, Laeeq Ahmed  wrote:

> @Soumya Simanta
>
> Right now it's just a proof of concept. Later I will have a real stream.
> It's EEG files of the brain. Later it can be used for real-time analysis of
> EEG streams.
>
> @Mayur
>
> The size is huge, yes. So it's better to do it in a distributed manner and, as I
> said above, I want to read it as a stream because later I will have stream data.
> This is a proof of concept.
>
> Regards,
> Laeeq
>
>   On Saturday, May 17, 2014 7:03 PM, Mayur Rustagi <
> mayur.rust...@gmail.com> wrote:
> The real question is why you are looking to consume the file as a stream:
> 1. It is too big to load as an RDD.
> 2. You need to operate on it in a sequential manner.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi 
>
>
>
> On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta 
> wrote:
>
> A file is just a stream with a fixed length. Usually streams don't end, but in
> this case it would.
>
> On the other hand, if you read your file as a stream you may not be able to use
> the entire data in the file for your analysis. Spark (given enough memory)
> can process large amounts of data quickly.
>
> On May 15, 2014, at 9:52 AM, Laeeq Ahmed  wrote:
>
> Hi,
>
> I have data in a file. Can I read it as a stream in Spark? I know it seems
> odd to read a file as a stream, but it has practical applications in real life
> if I can read it as a stream. Are there any other tools which can give this
> file as a stream to Spark, or do I have to make batches manually, which is not
> what I want? It's a column of a million values.
>
> Regards,
> Laeeq
>
>
>
>
>
>


Re: Historical Data as Stream

2014-05-17 Thread Laeeq Ahmed
@Soumya Simanta

Right now it's just a proof of concept. Later I will have a real stream. It's EEG 
files of the brain. Later it can be used for real-time analysis of EEG streams.

@Mayur

The size is huge, yes. So it's better to do it in a distributed manner and, as I said 
above, I want to read it as a stream because later I will have stream data. This is a 
proof of concept.

Regards,
Laeeq 


On Saturday, May 17, 2014 7:03 PM, Mayur Rustagi  
wrote:
 
The real question is why you are looking to consume the file as a stream:
1. It is too big to load as an RDD.
2. You need to operate on it in a sequential manner.


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi



On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta  
wrote:

A file is just a stream with a fixed length. Usually streams don't end, but in this 
case it would. 
>
>On the other hand, if you read your file as a stream you may not be able to use the 
>entire data in the file for your analysis. Spark (given enough memory) can 
>process large amounts of data quickly. 
>
>On May 15, 2014, at 9:52 AM, Laeeq Ahmed  wrote:
>
>
>Hi,
>>
>>I have data in a file. Can I read it as a stream in Spark? I know it seems odd 
>>to read a file as a stream, but it has practical applications in real life if I 
>>can read it as a stream. Are there any other tools which can give this file as 
>>a stream to Spark, or do I have to make batches manually, which is not what I 
>>want? It's a column of a million values.
>>
>>Regards,
>>Laeeq
>> 
>>

Re: Historical Data as Stream

2014-05-17 Thread Mayur Rustagi
The real question is why you are looking to consume the file as a stream:
1. It is too big to load as an RDD.
2. You need to operate on it in a sequential manner.

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi 



On Sat, May 17, 2014 at 5:12 AM, Soumya Simanta wrote:

> A file is just a stream with a fixed length. Usually streams don't end, but in
> this case it would.
>
> On the other hand, if you read your file as a stream you may not be able to use
> the entire data in the file for your analysis. Spark (given enough memory)
> can process large amounts of data quickly.
>
> On May 15, 2014, at 9:52 AM, Laeeq Ahmed  wrote:
>
> Hi,
>
> I have data in a file. Can I read it as a stream in Spark? I know it seems
> odd to read a file as a stream, but it has practical applications in real life
> if I can read it as a stream. Are there any other tools which can give this
> file as a stream to Spark, or do I have to make batches manually, which is not
> what I want? It's a column of a million values.
>
> Regards,
> Laeeq
>
>
>


Historical Data as Stream

2014-05-16 Thread Laeeq Ahmed
Hi,

I have data in a file. Can I read it as a stream in Spark? I know it seems odd to 
read a file as a stream, but it has practical applications in real life if I can 
read it as a stream. Are there any other tools which can give this file as a stream 
to Spark, or do I have to make batches manually, which is not what I want? It's a 
column of a million values.

Regards,
Laeeq
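
For illustration, a minimal sketch of what "making batches manually" could look like
with Spark Streaming's queueStream, assuming the Spark 1.x Scala API; the file path,
chunk size, and batch interval are hypothetical, and collecting the whole file to the
driver is only acceptable for a small proof of concept:

import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ReplayFileAsStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReplayFileAsStream")
    val ssc = new StreamingContext(conf, Seconds(1))
    val sc = ssc.sparkContext

    // Read the single column of values once, then slice it into small RDDs.
    val values = sc.textFile("hdfs:///path/to/eeg/values.txt").map(_.toDouble).collect()
    val rddQueue = new mutable.Queue[RDD[Double]]()
    values.grouped(1000).foreach(chunk => rddQueue += sc.parallelize(chunk))

    // Each batch interval, one queued RDD is consumed as if it had just arrived.
    val stream = ssc.queueStream(rddQueue)
    stream.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}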

Re: Historical Data as Stream

2014-05-16 Thread Soumya Simanta
A file is just a stream with a fixed length. Usually streams don't end, but in this 
case it would. 

On the other hand, if you read your file as a stream you may not be able to use the 
entire data in the file for your analysis. Spark (given enough memory) can 
process large amounts of data quickly. 
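
A minimal sketch of that batch alternative, assuming the Spark 1.x Scala API; the
HDFS path and the summary statistics are illustrative only:

import org.apache.spark.{SparkConf, SparkContext}

object BatchEegStats {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("BatchEegStats"))

    // Load the whole column of values as one RDD and analyse it in a single pass.
    val values = sc.textFile("hdfs:///path/to/eeg/values.txt").map(_.toDouble)

    // With the full dataset available, whole-file statistics are straightforward.
    println(s"count=${values.count()}, mean=${values.mean()}")
    sc.stop()
  }
}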

> On May 15, 2014, at 9:52 AM, Laeeq Ahmed  wrote:
> 
> Hi,
> 
> I have data in a file. Can I read it as a stream in Spark? I know it seems odd 
> to read a file as a stream, but it has practical applications in real life if I 
> can read it as a stream. Are there any other tools which can give this file as 
> a stream to Spark, or do I have to make batches manually, which is not what I 
> want? It's a column of a million values.
> 
> Regards,
> Laeeq
>