Hello, here's a simple program that demonstrates my problem:


Is "keyavg = rdd.values().reduce(sum) / rdd.count()" inside stats calculated
once per partition, or just once? I guess another way to ask the same
question is: does DStream.transform() run on the driver node or not?

What would be an alternative way to do this two-step computation without
calculating the average many times? I guess I could do it in a foreachRDD()
block, but that doesn't seem appropriate, given that this is more of a
transform than an action.
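For what it's worth, here is a plain-Python sketch (the original program
didn't come through, so the batch data and names here are made up) of one
way to fold the two actions into a single pass: accumulate (sum, count)
together, compute the average once, then normalize. In Spark itself the
analogous single-pass call would be rdd.aggregate(zeroValue, seqOp, combOp)
inside the transform function:

```python
from functools import reduce

# Hypothetical batch of (key, value) pairs, standing in for one RDD
# of the stream (the actual program isn't shown above).
batch = [("a", 2.0), ("b", 4.0), ("c", 6.0)]

# Single pass computes (sum, count) together, instead of two separate
# actions (reduce + count) as in the snippet quoted above.
total, count = reduce(
    lambda acc, kv: (acc[0] + kv[1], acc[1] + 1),
    batch,
    (0.0, 0),
)
keyavg = total / count  # computed once per batch

# Second step: normalize every value by the batch average.
normalized = [(k, v / keyavg) for k, v in batch]
```

In Spark the first step would still be a driver-side action triggered once
per batch interval, while the normalization step stays a lazy distributed
map over the RDD.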



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/does-dstream-transform-run-on-the-driver-node-tp24176.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
