An RDD[Double] is an abstraction for a large collection of doubles, possibly distributed across multiple nodes. DoubleRDDFunctions exists to compute statistics such as mean and variance across this distributed dataset.
In contrast, a Vector is not distributed and fits on your local machine. You would be better off computing these quantities on the Vector directly (see mllib.clustering.GaussianMixture#vectorMean for an example of how to compute the mean of a vector).

On Tue, Jul 7, 2015 at 8:26 PM, 诺铁 <noty...@gmail.com> wrote:

> hi,
>
> there are some useful functions in DoubleRDDFunctions, which I can use if
> I have RDD[Double], eg, mean, variance.
>
> Vector doesn't have such methods, how can I convert Vector to RDD[Double],
> or maybe better if I can call mean directly on a Vector?
>
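
For reference, here is a minimal sketch of computing the mean and variance of a local Vector without going through an RDD. The LocalStats object and its method names are hypothetical helpers, not part of Spark; the sketch assumes you first get the values out with the Vector's toArray method. The variance here is the population variance (dividing by n), which as far as I know is also what variance() on an RDD[Double] returns.

```scala
// Hypothetical helper, not part of Spark: statistics over a local array of doubles.
object LocalStats {

  /** Arithmetic mean of the values; NaN for an empty array. */
  def mean(xs: Array[Double]): Double =
    if (xs.isEmpty) Double.NaN else xs.sum / xs.length

  /** Population variance (divides by n, not n - 1); NaN for an empty array. */
  def variance(xs: Array[Double]): Double =
    if (xs.isEmpty) Double.NaN
    else {
      val m = mean(xs)
      // Average of the squared deviations from the mean.
      xs.map(x => (x - m) * (x - m)).sum / xs.length
    }
}
```

With an mllib Vector v, this would be called as LocalStats.mean(v.toArray) and LocalStats.variance(v.toArray). Note that toArray on a sparse vector materializes all the zeros, so this is only a reasonable approach when the dense array fits comfortably in memory.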