--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p14062.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
and std dev for Paired RDDs (key, value)?
Now I'm using an approach with reduceByKey but want to make my code more
concise and readable.
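A possible way to keep the reduce-style approach readable is to fold each value into a small mergeable accumulator. What follows is only a sketch in plain Scala, not code from this thread: `KeyedStats`, `Acc`, `zero`, and `statsByKey` are hypothetical names, and `statsByKey` is a local stand-in for a keyed aggregation. On an RDD of (key, Double) pairs the same two functions could presumably be handed to `aggregateByKey(KeyedStats.zero)(_ add _, _ merge _)`, and Spark's own `StatCounter` covers similar ground.

```scala
// Hypothetical helper (not from the thread): a mergeable accumulator
// holding count, running mean, and sum of squared deviations (M2).
object KeyedStats {
  final case class Acc(count: Long, mean: Double, m2: Double) {
    // Fold one value into the accumulator (Welford's update).
    def add(x: Double): Acc = {
      val n = count + 1
      val delta = x - mean
      val newMean = mean + delta / n
      Acc(n, newMean, m2 + delta * (x - newMean))
    }

    // Combine two partial accumulators, e.g. from different partitions.
    def merge(o: Acc): Acc =
      if (count == 0) o
      else if (o.count == 0) this
      else {
        val n = count + o.count
        val delta = o.mean - mean
        Acc(n, mean + delta * o.count / n,
          m2 + o.m2 + delta * delta * count * o.count / n)
      }

    // Population variance and standard deviation (divide by n, not n - 1).
    def variance: Double = if (count == 0) Double.NaN else m2 / count
    def stdev: Double = math.sqrt(variance)
  }

  val zero: Acc = Acc(0L, 0.0, 0.0)

  // Local stand-in for a per-key aggregation over (key, value) pairs.
  def statsByKey[K](pairs: Seq[(K, Double)]): Map[K, Acc] =
    pairs.foldLeft(Map.empty[K, Acc]) { case (m, (k, v)) =>
      m.updated(k, m.getOrElse(k, zero).add(v))
    }
}
```

Because `merge` is associative, partial results can be combined in any order, which is what a reduce-style aggregation needs.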
[Double]]
.values.stats
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p14065.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
print("stddev: " + stddev)
stddev
}
I hope that helps
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11334.html
Sent from the Apache Spark User List mailing list archive
going down the wrong path?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
The reason I want an RDD is because I'm assuming that iterating the
individual elements of an RDD on the driver of the cluster is much slower
than coming up with the mean and standard deviation using a
map-reduce-based algorithm.
I don't know the intimate details of Spark's implementation, but it
You're certainly not iterating on the driver. The Iterable you process
in your function is on the cluster and done in parallel.
On Fri, Aug 1, 2014 at 8:36 PM, Kristopher Kalish k...@kalish.net wrote:
The reason I want an RDD is because I'm assuming that iterating the
individual elements of an
iterable.foreach { y =>
sum = sum + y.foo
count = count + 1
}
val mean = sum/count;
// save mean to database...
}
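The loop above stops at the mean. A minimal sketch of a per-group function that also returns the standard deviation might look like the following. It assumes the grouped values are plain Doubles rather than objects with a `foo` field, and `IterableStats` and `meanAndStdDev` are hypothetical names, not part of the thread or of Spark. Something like it could be passed to `mapValues` after `groupByKey`, in which case it runs wherever each group's Iterable lives.

```scala
// Hypothetical helper (not from the thread): mean and population
// standard deviation of one group's values, in two passes.
object IterableStats {
  def meanAndStdDev(values: Iterable[Double]): (Double, Double) = {
    val n = values.size
    require(n > 0, "need at least one value")
    // First pass: mean.
    val mean = values.sum / n
    // Second pass: average squared deviation from the mean.
    val variance = values.iterator.map(v => (v - mean) * (v - mean)).sum / n
    (mean, math.sqrt(variance))
  }
}
```

This needs the whole group in memory at once, so for very large groups a single-pass accumulator is probably the safer choice.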
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11207.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Computing-mean-and-standard-deviation-by-key-tp11192p11214.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.