Or combine both! With Spark Streaming it is possible to combine streaming data 
with data already stored on HDFS. In the end it always depends on what you want 
to do and when you need it.
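
A minimal sketch of combining both, assuming Spark 1.x with the
spark-streaming-twitter artifact on the classpath and Twitter OAuth
credentials supplied as twitter4j system properties. The HDFS paths and the
watched_words.txt file are hypothetical names used only for illustration.
Each 2-second batch is joined against a static RDD loaded from HDFS, and the
raw tweet text is also written back to HDFS for later batch analysis:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object CombineStreamAndHdfs {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("CombineStreamAndHdfs")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Static data already on HDFS: one word per line (hypothetical path).
    val watched = ssc.sparkContext
      .textFile("hdfs:///user/example/watched_words.txt")
      .map(word => (word.toLowerCase, 1))
      .cache()

    val tweets = TwitterUtils.createStream(ssc, None)

    // Count words per batch, then keep only those present in the HDFS data.
    val counts = tweets
      .flatMap(status => status.getText.split("\\s+"))
      .map(word => (word.toLowerCase, 1))
      .reduceByKey(_ + _)
    val hits = counts.transform(rdd => rdd.join(watched).mapValues(_._1))
    hits.print()

    // Also store the raw text on HDFS so it can be analysed later in batch.
    tweets.map(status => status.getText)
      .saveAsTextFiles("hdfs:///user/example/tweets/batch")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The join gives the real-time view, while the saved files cover the "use
later" case, so the two approaches do not exclude each other.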

> On 03 Jun 2016, at 10:26, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> I use Spark Streaming to experiment with Twitter data. 
> Basic stuff
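>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>     import org.apache.spark.streaming.twitter.TwitterUtils
>     val ssc = new StreamingContext(sparkConf, Seconds(2))  // 2-second batches
>     val tweets = TwitterUtils.createStream(ssc, None)
>     val statuses = tweets.map(status => status.getText())
>     statuses.print()
>     ssc.start()             // nothing runs until the context is started
>     ssc.awaitTermination()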
> Another alternative is to use Apache Flume to get the Twitter data and store 
> it as log files in HDFS.
> I notice that these log files are stored as binary log files.
> I presume the log files can be read and converted to JSON through another 
> process, or used for machine learning.
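
On the binary files: Flume's HDFS sink writes SequenceFiles by default, which
would explain this. A rough sketch of reading them back in Spark, assuming the
sink's default Writable format (LongWritable key, BytesWritable event body)
and assuming the event bodies are UTF-8 JSON. That last point depends on the
Twitter source used (the one bundled with Flume emits Avro instead, in which
case the decoding step would differ). The path is hypothetical:

```scala
import org.apache.hadoop.io.{BytesWritable, LongWritable}
import org.apache.spark.{SparkConf, SparkContext}

object ReadFlumeTweets {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ReadFlumeTweets"))

    // Key and value classes are an assumption based on the HDFS sink defaults.
    val events = sc.sequenceFile(
      "hdfs:///user/example/flume/tweets",
      classOf[LongWritable],
      classOf[BytesWritable])

    // Copy the bytes out immediately: Hadoop reuses Writable instances.
    val json = events.map { case (_, body) =>
      new String(body.copyBytes(), "UTF-8")
    }

    json.take(5).foreach(println)
    sc.stop()
  }
}
```
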
> I know this question may not be directly relevant, but which is the main 
> approach: real-time analysis of Twitter data using Spark Streaming, or 
> storing the data in HDFS and using it later?
> Thanks
> Dr Mich Talebzadeh
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> http://talebzadehmich.wordpress.com
