groupBy for DStream

2014-11-11 Thread SK

Hi.

1) I don't see a groupBy() method on a DStream object, and I'm not sure why it
is not supported. Currently I am using filter() to separate out the different
groups. I would like to know if there is a way to convert a DStream object
into a regular RDD so that I can apply RDD methods like groupBy.


2) The count() method on a DStream returns a DStream[Long] instead of a plain
Long (as RDD.count() does). How can I extract the plain Long count value? I
tried dstream(0) but got a compilation error saying it does not take
parameters. I also tried dstream[0], which resulted in a compilation error as
well. I am not able to use head() or take(0) on a DStream either.

thanks






Re: groupBy for DStream

2014-11-11 Thread Akhil Das
1. Use foreachRDD on the DStream; inside it, each micro-batch is handed to you
as a regular RDD, on which you can call groupBy() (see the sketch below).

2. DStream.count() returns a new DStream in which each RDD has a single
element, generated by counting the corresponding RDD of the original DStream,
so the Long value has to be extracted per batch rather than from the DStream
itself (also shown below).
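To make this concrete, here is a minimal sketch of both points. The socket
text source on localhost:9999, the 10-second batch interval, and the grouping
key (line length) are placeholder assumptions for illustration, not something
from your job; adapt them to your input stream.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal sketch: source, batch interval and grouping key are assumptions.
val conf = new SparkConf().setAppName("DStreamGroupByExample").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(10))
val lines = ssc.socketTextStream("localhost", 9999)

// 1) groupBy per micro-batch: foreachRDD hands each batch to you as a plain
//    RDD, so the usual RDD methods (groupBy, take, collect, ...) apply.
lines.foreachRDD { rdd =>
  val grouped = rdd.groupBy(line => line.length)   // hypothetical grouping key
  grouped.take(5).foreach(println)
}

// 2) Extracting a plain Long from DStream.count(): each batch's RDD holds a
//    single Long element, so pull it out inside foreachRDD.
lines.count().foreachRDD { rdd =>
  val c: Long = rdd.collect().headOption.getOrElse(0L)
  println(s"Batch count: $c")
}

ssc.start()
ssc.awaitTermination()

If you only need grouping or counting per batch, you can of course keep just
the corresponding foreachRDD block.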

Thanks
Best Regards
