Hi,

I tried out the streaming program from the Spark training web page. I created
a Twitter app as per the instructions (pointing to http://www.twitter.com).
When I run the program, my credentials are printed correctly, but after that
the program just keeps waiting. It never prints the hashtag counts or any of
the other output. My code appears below (it is essentially the same as what
is on the training web page). I would like to know why I am not getting a
continuous stream and the hashtag counts.

Thanks

   // relevant code snippet 

   
     TutorialHelper.configureTwitterCredentials(apiKey, apiSecret, accessToken, accessTokenSecret)

     val ssc = new StreamingContext(new SparkConf(), Seconds(1))
     val tweets = TwitterUtils.createStream(ssc, None)
     val statuses = tweets.map(status => status.getText())
     statuses.print()

     ssc.checkpoint(checkpointDir)

     val words = statuses.flatMap(status => status.split(" "))
     val hashtags = words.filter(word => word.startsWith("#"))
     hashtags.print()

     val counts = hashtags.map(tag => (tag, 1))
                          .reduceByKeyAndWindow(_ + _, _ - _, Seconds(60 * 5), Seconds(1))
     counts.print()

     val sortedCounts = counts.map { case(tag, count) => (count, tag) }
                         .transform(rdd => rdd.sortByKey(false))
     sortedCounts.foreach(rdd =>
       println("\nTop 10 hashtags:\n" + rdd.take(10).mkString("\n")))

     ssc.start()
     ssc.awaitTermination()

//end code snippet 




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-training-Spark-Summit-2014-tp9465.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
