1. Use foreachRDD on the DStream; inside it, each RDD supports RDD methods
such as groupBy().
2. DStream.count() returns a new DStream in which each RDD has a single
element: the count of the corresponding RDD of this DStream.
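A minimal sketch of both points, assuming a DStream[(String, Int)] called stream (the names and types here are illustrative, not from the original thread):

```scala
import org.apache.spark.streaming.dstream.DStream

// 1) Per-batch groupBy: foreachRDD (or transform) exposes the underlying
//    RDD of each batch, where RDD methods like groupBy are available.
def perBatchGroups(stream: DStream[(String, Int)]): Unit = {
  stream.foreachRDD { rdd =>
    val grouped = rdd.groupBy { case (key, _) => key }
    grouped.foreach { case (key, records) =>
      println(s"$key -> ${records.size} records")
    }
  }
}

// 2) Extracting the per-batch count as a plain Long: DStream.count()
//    yields a DStream[Long] with one element per batch interval, so
//    collect that element inside foreachRDD.
def printBatchCounts(stream: DStream[String]): Unit = {
  stream.count().foreachRDD { rdd =>
    rdd.collect().headOption.foreach { n =>
      println(s"batch count: $n")
    }
  }
}
```

Note that there is no direct conversion from a DStream to a single RDD; a DStream is a sequence of RDDs, one per batch interval, which is why the per-batch callbacks above are needed.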
Thanks
Best Regards
On Wed, Nov 12, 2014 at 2:49 AM, SK skrishna...@gmail.com wrote:
Hi.
1) I don't see a groupBy() method on a DStream object, and I'm not sure why
it is not supported. Currently I am using filter() to separate out the
different groups. I would like to know if there is a way to convert a
DStream object to a regular RDD so that I can apply RDD methods like
groupBy.
2) The count() method on a DStream object returns a DStream[Long] instead
of a simple Long (as RDD.count() does). How can I extract the simple Long
count value? I tried dstream(0) but got a compilation error that it does
not take parameters. I also tried dstream[0], but that also resulted in a
compilation error. I am not able to use the head() or take(0) method on a
DStream either.
thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/groupBy-for-DStream-tp18623.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.