Hello, here's a simple program that demonstrates my problem:
Is "keyavg = rdd.values().reduce(sum) / rdd.count()" inside stats calculated once per partition, or just once? I guess another way to ask the same question is: is the function passed to DStream.transform() called on the driver node or not?

What would be an alternative way to do this two-step computation without calculating the average many times? I guess I could do it in a foreachRDD() block, but that doesn't seem appropriate given that this is more of a transform than an action.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/does-dstream-transform-run-on-the-driver-node-tp24176.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
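To make the question concrete, here is a minimal pure-Python sketch of the two-step computation I mean: compute the average of the values once per batch, then reuse it when mapping over the same batch. This runs without Spark; "stats" and "batch" are just illustrative stand-ins for the transform function and the per-interval RDD of (key, value) pairs.

```python
from functools import reduce

def stats(batch):
    """Stand-in for a DStream.transform() function, invoked once per batch.

    `batch` plays the role of an RDD of (key, value) pairs.
    """
    values = [v for _, v in batch]
    # Step 1: the average is computed exactly once for the whole batch.
    keyavg = reduce(lambda a, b: a + b, values) / len(batch)
    # Step 2: reuse the already-computed average via the closure.
    return [(k, v - keyavg) for k, v in batch]

batch = [("a", 1.0), ("b", 2.0), ("c", 3.0)]
print(stats(batch))  # average is 2.0 -> [('a', -1.0), ('b', 0.0), ('c', 1.0)]
```

The worry is whether, in the real PySpark version, the reduce/count pair inside the transform function ends up re-evaluated (e.g. once per partition) rather than behaving like this single driver-side computation per batch.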