[ 
https://issues.apache.org/jira/browse/SPARK-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136118#comment-15136118
 ] 

holdenk commented on SPARK-13226:
---------------------------------

Further investigation shows that the reference implementation does both, 
depending on the k-means algorithm used, so I will use a tolerance for the 
tolerance-based case.

> MLLib PowerIteration Clustering depends on deprecated KMeans setRuns API
> ------------------------------------------------------------------------
>
>                 Key: SPARK-13226
>                 URL: https://issues.apache.org/jira/browse/SPARK-13226
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: holdenk
>            Priority: Trivial
>
> The current MLlib PowerIteration clustering implementation sets the number of
> runs in its k-means call to 5 (an apparently arbitrary value). This should
> likely be replaced with a specific convergence tolerance.
> The reference implementation also appears to use a tolerance, so this change
> would also move closer to the reference implementation
> (http://www.cs.cmu.edu/~wcohen/).
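
For readers unfamiliar with the distinction, here is a rough sketch of what
"tolerance based" stopping means for k-means: iterate until no center moves
by more than a threshold, rather than repeating the whole algorithm a fixed
number of times. This is plain Python on 1-D points, not Spark code; the
function name `kmeans_with_tolerance` and its parameters are made up for
illustration. In MLlib the comparable knob should be `KMeans.setEpsilon`, but
treat that mapping as an assumption rather than a description of the eventual
patch.

```python
import random


def kmeans_with_tolerance(points, k, epsilon=1e-4, max_iterations=100, seed=0):
    """Lloyd's k-means on 1-D points (hypothetical helper, not Spark's code).

    Stops as soon as no center moves by more than `epsilon`, or after
    `max_iterations` passes, whichever comes first.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k initial centers from the data
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster,
        # keeping the old center if the cluster ended up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        # Convergence check: largest distance any center moved this pass.
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= epsilon:
            break
    return sorted(centers), iteration


points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
centers, iterations = kmeans_with_tolerance(points, k=2)
print(centers, iterations)
```

A smaller `epsilon` trades extra passes for tighter convergence, and
`max_iterations` stays as a safety cap in case the tolerance is never met.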


