I don't think this can be done directly with .aggregateByKey(), because we
also need the count of values per key (for the average). Maybe we can use
.countByKey(), which returns a map, together with .foldByKey(0)(_+_) (or
aggregateByKey()), which gives the sum of values per key. I'm not sure how
to proceed from there.
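
That said, a single aggregateByKey() pass might still work if the
accumulator carries the count alongside the sum and the sum of squares.
Below is a rough sketch of that idea, not a tested solution. It assumes the
keys are Strings and the values are Doubles, and that the population
standard deviation (dividing by n) is what is wanted. The names KeyStats,
Acc, seqOp, combOp, meanAndStd and statsByKey are made up for illustration.

```scala
import org.apache.spark.rdd.RDD
// Brings the pair-RDD implicits into scope on Spark versions before 1.3;
// newer versions find them without this import.
import org.apache.spark.SparkContext._

object KeyStats {
  // Per-key accumulator: (count, sum, sum of squares).
  type Acc = (Long, Double, Double)

  val zero: Acc = (0L, 0.0, 0.0)

  // Fold one value into the accumulator within a partition.
  def seqOp(acc: Acc, v: Double): Acc =
    (acc._1 + 1L, acc._2 + v, acc._3 + v * v)

  // Merge two accumulators built on different partitions.
  def combOp(a: Acc, b: Acc): Acc =
    (a._1 + b._1, a._2 + b._2, a._3 + b._3)

  // Convert an accumulator into (mean, population standard deviation).
  // Assumes count > 0, which holds for any key that appears in the RDD.
  def meanAndStd(acc: Acc): (Double, Double) = {
    val (n, sum, sumSq) = acc
    val mean = sum / n
    // Clamp at zero in case rounding makes the variance slightly negative.
    val variance = math.max(sumSq / n - mean * mean, 0.0)
    (mean, math.sqrt(variance))
  }

  // One pass over the data: aggregate per key, then finish each key.
  def statsByKey(rdd: RDD[(String, Double)]): RDD[(String, (Double, Double))] =
    rdd.aggregateByKey(zero)(seqOp, combOp).mapValues(meanAndStd)
}
```

With an RDD of (key, value) pairs, KeyStats.statsByKey(rdd).collect() should
give one (mean, stddev) pair per key. Note that the sum-of-squares formula
can lose precision when the values are large and close together, so this may
not be suitable for every data set.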
Regards
On Fri, Oct 31,
Hi everyone, I have an RDD filled with data like (k1, v11)
(k1, v12) (k1, v13) (k2, v21) (k2, v22) (k2, v23)
...
I want to calculate the average and standard deviation of (v11, v12, v13)
and (v21, v22, v23) grouped by their keys for