[ 
https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321538#comment-14321538
 ] 

Travis Galoppo commented on SPARK-5016:
---------------------------------------

@mechcoder I may well be missing something simple here, but the sums for each 
cluster are not independent: you need the sum of the likelihoods over all 
clusters to compute the partial assignments (see 
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala#L217),
 so it seems to me there would be an additional communication step involved in 
this.
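
To make the dependency concrete, here is a minimal sketch in Python. It is not 
the MLlib code: it assumes 1-D Gaussians, and the helper names gaussian_pdf and 
responsibilities are hypothetical, chosen only for illustration. The point is 
that the normalizer for one point's partial assignments needs the weighted 
likelihood from every cluster.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (illustrative stand-in for a multivariate one)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Partial (soft) assignments of point x to each cluster."""
    # Weighted likelihood of x under each cluster; each term is local to one cluster.
    weighted = [w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
    # The normalizer couples the clusters: it sums over ALL of them.
    total = sum(weighted)
    return [p / total for p in weighted]


# Two equally weighted clusters; x = 2.0 is equidistant from both means.
resp = responsibilities(2.0, weights=[0.5, 0.5], means=[0.0, 4.0],
                        variances=[1.0, 1.0])
```

If each cluster's likelihoods were computed on a different worker, the value of 
total would have to be gathered across workers before any cluster's sums could 
be accumulated, which is the extra communication step I mean.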

Again, I may be missing something simple.


> GaussianMixtureEM should distribute matrix inverse for large numFeatures, k
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-5016
>                 URL: https://issues.apache.org/jira/browse/SPARK-5016
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>
> If numFeatures or k are large, GMM EM should distribute the matrix inverse 
> computation for Gaussian initialization.



