Hi all,

I want to calculate mean and SD for each RDD. I used the followoing code for 
mean and now I have to use this mean for SD, but not sure how to use get these 
means for each RDD from the DStream, so I can use it for SD. My sample files is 
as

1
2
3
4
5


The code is as

 val individualpoints = ssc.textFileStream(args(1))
val sumandcount = individualpoints.map(x => (x.toDouble, 1)).reduce{ (a, b) => 
(a._1 + b._1, a._2 + b._2) }

sumandcount.foreachRDD(rdd => {
        rdd.collect.foreach({
                case (sum, n) => {
                        println("********************************************")
                        println("The Σ(x) is --> " + sum + " with N --> " + n)
                        mean = sum/n
                        println("The μ is -->" + mean)
                        println("*********************************************")
                                }
                        })
                })

Regards,
Laeeq

Reply via email to