Re: Spark Streaming RDD transformation

2014-06-26 Thread Sean Owen
If you want to transform an RDD to a Map, I assume you have an RDD of pairs. The method collectAsMap() creates a Map from the RDD in this case. Do you mean that you want to update a Map object using data in each RDD? You would use foreachRDD() in that case. Then you can use RDD.foreach to do
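
A rough sketch of the foreachRDD() plus collectAsMap() pattern described above, using the PySpark method names. The thread does not say which language binding is in use, so Python is an assumption here, as are the names `global_map`, `merge_batch` and `track_totals` and the choice to sum values per key. `stream` is assumed to be a DStream of (key, value) pairs.

```python
# Sketch only: `global_map` is a hypothetical driver-side dict, and summing
# values per key is an assumed update rule (the thread does not specify one).

def merge_batch(global_map, batch_map):
    """Add each value in batch_map to the running total in global_map."""
    for key, value in batch_map.items():
        global_map[key] = global_map.get(key, 0) + value
    return global_map


def track_totals(stream, global_map):
    """Register a per-batch update of global_map on a pair DStream."""
    def update(rdd):
        # The function given to foreachRDD runs on the driver, so
        # collectAsMap() pulls the whole batch to the driver as a dict.
        # Assumes each batch fits in driver memory; if a key repeats
        # within one batch, collectAsMap() keeps only one of its values.
        merge_batch(global_map, rdd.collectAsMap())

    stream.foreachRDD(update)
```

Collecting inside foreachRDD keeps the update on the driver, where `global_map` lives. If a batch can contain repeated keys, reducing by key before collectAsMap() would avoid silently dropping values.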

Re: Spark Streaming RDD transformation

2014-06-26 Thread Bill Jay
Thanks, Sean! I am currently using foreachRDD to update the global map using data in each RDD. The reason I want to return the map as an RDD instead of just updating the map is that an RDD provides many handy methods for output. For example, I want to save the global map to files in HDFS for each batch
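
One possible way to get RDD output methods for a driver-side map is to turn it back into an RDD once per batch. The sketch below uses PySpark names (parallelize(), saveAsTextFile(), the RDD's `context` property); the language binding, the helper names, the `prefix` argument and the path format are all assumptions, not something stated in the thread.

```python
# Sketch only: `global_map` is a hypothetical driver-side dict and `prefix`
# a hypothetical output location such as "hdfs:///tmp/global-map".

def save_snapshot(sc, global_map, path):
    """Write the current contents of global_map to `path` via an RDD."""
    # parallelize() turns the driver-side items into an RDD, which makes
    # RDD output methods such as saveAsTextFile() available.
    snapshot = sc.parallelize(list(global_map.items()))
    snapshot.saveAsTextFile(path)


def save_each_batch(stream, global_map, prefix):
    """Register a per-batch snapshot of global_map on a DStream."""
    def on_batch(batch_time, rdd):
        # With a two-argument function, foreachRDD also passes the batch
        # time (assumed here to be a datetime, as in PySpark).
        suffix = batch_time.strftime("%Y%m%d-%H%M%S")
        save_snapshot(rdd.context, global_map, "%s-%s" % (prefix, suffix))

    stream.foreachRDD(on_batch)
```

Each batch gets its own output directory because saveAsTextFile() refuses to write to a path that already exists.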