OK, got it, thanks.
On Thu, Jul 9, 2015 at 12:02 PM, prosp4300 wrote:
It seems Feynman was referring to the source code rather than the
documentation; vectorMean is private. See:
https://github.com/apache/spark/blob/v1.3.0/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala
At 2015-07-09 10:10:58, "诺铁" wrote:
Thanks, I understand now.
But I can't find mllib.clustering.GaussianMixture#vectorMean. Which
version of Spark are you using?
On Thu, Jul 9, 2015 at 1:16 AM, Feynman Liang wrote:
An RDD[Double] is an abstraction for a large collection of doubles, possibly
distributed across multiple nodes. The DoubleRDDFunctions are there for
performing mean and variance calculations across this distributed dataset.
In contrast, a Vector is not distributed and fits on your local machine.
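
To make the distinction concrete, here is a minimal sketch of both cases. It assumes spark-core and spark-mllib are on the classpath and runs in local mode; the object name, the sample values, and the `local[2]` master are illustrative choices only, not anything taken from the Spark sources.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical object name, used only for this illustration.
object MeanVarianceSketch {
  def main(args: Array[String]): Unit = {
    // Local mode is an assumption made to keep the sketch self-contained.
    val conf = new SparkConf().setAppName("MeanVarianceSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Distributed case: an RDD[Double] is implicitly wrapped in
      // DoubleRDDFunctions, which provides mean() and variance().
      val distributed: RDD[Double] = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
      println(s"RDD mean = ${distributed.mean()}, variance = ${distributed.variance()}")

      // Local case: a Vector lives in the driver's memory, so ordinary
      // Scala array operations are enough.
      val local: Vector = Vectors.dense(1.0, 2.0, 3.0, 4.0)
      val values = local.toArray
      val mean = values.sum / values.length
      // Population variance (divides by n), matching what variance() computes.
      val variance = values.map(x => (x - mean) * (x - mean)).sum / values.length
      println(s"Vector mean = $mean, variance = $variance")
    } finally {
      sc.stop()
    }
  }
}
```

The RDD calls run as a distributed job across partitions, while the Vector computation is just a local loop over an array.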
Yo