[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321538#comment-14321538 ]
Travis Galoppo commented on SPARK-5016:
---------------------------------------

@mechcoder I may well be missing something simple here... but the sums for each cluster are not independent... you need the sums of the likelihoods from each to compute the partial assignments (see https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala#L217)... so it seems to me there would be an additional communication step involved in this. Again, I may be missing something simple.

> GaussianMixtureEM should distribute matrix inverse for large numFeatures, k
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-5016
>                 URL: https://issues.apache.org/jira/browse/SPARK-5016
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>
> If numFeatures or k are large, GMM EM should distribute the matrix inverse
> computation for Gaussian initialization.
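
For illustration only, the dependence described in the comment can be sketched in plain Python for a one-dimensional mixture. This is not the Spark code linked in the comment; the function names (gaussian_pdf, responsibilities, expectation_sums) and the 1-D simplification are assumptions made for the example. The point is that each partial assignment is divided by a total taken over all clusters, so no single cluster's sums can be accumulated without the likelihoods of the others.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian at x."""
    norm = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / norm


def responsibilities(x, weights, means, variances):
    """Partial (soft) assignments of a single point x to each cluster."""
    # Weighted likelihood of x under every cluster.
    likelihoods = [
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # The normalizer sums over ALL clusters; this is what couples them.
    total = sum(likelihoods)
    return [p / total for p in likelihoods]


def expectation_sums(points, weights, means, variances):
    """Accumulate per-cluster sums that an M-step would consume."""
    k = len(weights)
    resp_sum = [0.0] * k    # sum of partial assignments per cluster
    weighted_x = [0.0] * k  # assignment-weighted sum of points per cluster
    for x in points:
        r = responsibilities(x, weights, means, variances)
        for j in range(k):
            resp_sum[j] += r[j]
            weighted_x[j] += r[j] * x
    return resp_sum, weighted_x
```

Calling expectation_sums([-1.0, 0.0, 1.0], [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]) returns two lists of per-cluster sums. If the clusters were split across workers, each worker would still need the shared total computed in responsibilities before it could update its own sums, which is the extra communication step the comment refers to.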