Multiple operations on same DStream in Spark Streaming

foobar Sat, 25 Jul 2015 15:08:25 -0700

Hi I'm working with Spark Streaming using scala, and trying to figure out the
following problem. In my DStream[(int, int)], each record is an int pair
tuple. For each batch, I would like to filter out all records with first
integer below average of first integer in this batch, and for all records
with first integer above average of first integer in the batch, compute the
average of second integers in such records. What's the best practice to
implement this? I tried this but kept getting the object not serializable
exception because it's hard to share variables (such as average of first int
in the batch) between workers and driver. Any suggestions? Thanks!




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Multiple-operations-on-same-DStream-in-Spark-Streaming-tp23995.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

Multiple operations on same DStream in Spark Streaming

Reply via email to