Hi,

I've implemented Twitter streaming with the code given at the bottom of this
email. It finds some tweets containing the hashtags I'm following. However,
it seems that a large number of tweets are missing. I posted some tweets
containing hashtags that the application follows, and none of them were
received by the application. I also checked some hashtags (e.g. #android) on
Twitter using Live and could see that something was posted with that hashtag
almost every second, while my application received only 3-4 posts per minute.

I didn't have this problem in the earlier non-Spark version of the
application, which used twitter4j to access the user stream API. I guess this
is some kind of trending stream, but I couldn't find anything that explains
which Twitter API Spark Twitter Streaming uses, or how to create a stream that
will access everything posted on Twitter.

I hope somebody can explain what the problem is and how to solve it.

Thanks,
Zoran


  import org.apache.spark.streaming.dstream.DStream
  import org.apache.spark.streaming.twitter.TwitterUtils
  import twitter4j.Status
  import twitter4j.auth.{Authorization, OAuthAuthorization}

  // ssc, hashTagsList and getTwitterConfigurationBuilder are defined
  // elsewhere in the application.
  def initializeStreaming(): Unit = {
    val config = getTwitterConfigurationBuilder.build()
    val auth: Option[Authorization] = Some(new OAuthAuthorization(config))
    // No filter terms are passed to createStream here.
    val stream: DStream[Status] = TwitterUtils.createStream(ssc, auth)
    // Keep only statuses whose lowercased text contains a followed hashtag.
    val filteredStatuses = stream.filter { status =>
      val text = status.getText.toLowerCase
      hashTagsList.exists(tag => text.contains(tag))
    }
    // Print every matching status on the driver.
    filteredStatuses.foreachRDD { rdd =>
      rdd.collect().foreach(println)
    }
    ssc.start()
  }
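
To rule out the matching logic itself as the cause of the missing tweets, the hashtag check can be pulled out into a plain function and tried without Spark or Twitter. This is only a minimal sketch: the names HashtagFilter and matchesAnyTag are hypothetical and not part of the application, and it lowercases the tags as well as the text, which the code above does not do.

```scala
object HashtagFilter {
  // Hypothetical helper: returns true if the text contains at least one
  // of the given tags as a substring, ignoring case on both sides.
  def matchesAnyTag(text: String, tags: Seq[String]): Boolean = {
    val lowered = text.toLowerCase
    tags.exists(tag => lowered.contains(tag.toLowerCase))
  }
}
```

In the streaming code this would be used as stream.filter(status => HashtagFilter.matchesAnyTag(status.getText, hashTagsList)). Because it is a substring match, a tag such as #android also matches longer hashtags that start with it.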
