GitHub user fjiang6 opened a pull request: https://github.com/apache/spark/pull/4254

    [SPARK-4259][MLlib]: Add Power Iteration Clustering Algorithm with Gaussian Similarity Function

    Add single pseudo-eigenvector PIC
    Including documentations, one property file and updated pom.xml
    with the following codes:
    mllib/src/main/scala/org/apache/spark/mllib/clustering/PIClustering.scala
    mllib/src/test/scala/org/apache/spark/mllib/clustering/PIClusteringSuite.scala

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Huawei-Spark/spark PIC

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4254.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4254

----
commit a3c5fbe3451b665968d503fa4ee52f1f6118252a
Author: Jiang Fan <fjia...@gmail.com>
Date:   2015-01-22T21:52:52Z

    Adding Power Iteration Clustering

commit d5aae2032c08d097ed3c6cd61ed2612a55a619df
Author: Jiang Fan <fjia...@gmail.com>
Date:   2015-01-22T21:57:35Z

    Adding Power Iteration Clustering and Suite test

commit 3fd5bc895f1594c57a182c31e010966affb47325
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-23T00:17:57Z

    PIClustering is running in new branch (up to the pseudo-eigenvector convergence step)

commit 0ef163f89ed82ed72967b51330e16ac3cf5759be
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-23T04:20:47Z

    Added ConcentricCircles data generation and KMeans clustering

commit 32a90dc5570ea02ee25b80c4440293581416209c
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-23T16:48:00Z

    Update circles test data values

commit 0700335d7b4fe9132046f034a67eb3405cd20953
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-23T22:30:53Z

    First end to end working version: but has bad performance issue

commit e5df2b88c3668ecc4bc0cd25cde10dd033b9f72f
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-24T04:20:32Z

    First end to end working PIC

commit 929426339d9934d61878880b2182bc5e18acee6c
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-25T11:00:07Z

    Added visualization/plotting of input/output data

commit a2b1e5720266393a1813f0abe43c3709ebf46268
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-25T11:21:43Z

    Revert inadvertent update to KMeans

commit b7dbcbe56767a8609314a20f24e907c426e827af
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-26T00:03:46Z

    Added axes and combined into single plot for matplotlib

commit f656c349b059a7df1c6415e69c2010873ba4d2d4
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-26T00:04:10Z

    Added iris dataset

commit a112f38d0476cee2bb5aa49311ce98b800141f8e
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-26T08:42:05Z

    Added graphx main and test jars as dependencies to mllib/pom.xml

commit ace9749338c7454d17839dcf98ed75b131a21537
Author: Fan Jiang <fanjiang...@huawei.com>
Date:   2015-01-26T18:27:50Z

    Update PIClustering.scala

commit b29c0dbf081d8baa30a3a83b57492bf92b2f4b6a
Author: Fan Jiang <fanjiang...@huawei.com>
Date:   2015-01-26T18:57:04Z

    Update PIClustering.scala

commit bea48eaa0cca25695c283616d86235227357980c
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-27T00:58:57Z

    Converted custom Linear Algebra datatypes/routines to use Breeze.

commit 90e7fa4b58b6d12f6b04dab3bf5f0a9d50f8d330
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T02:04:05Z

    Converted from custom Linalg routines to Breeze: added JavaDoc comments; added Markdown documentation

commit be659e31f5d9b1d35561ee43620f36d26732a950
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T02:06:53Z

    Added mllib specific log4j

commit 060e6bf8d45a211a6b71e2cba8e4bf2b14b9e72a
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T06:49:12Z

    Added link to PIC doc from the main clustering md doc

commit 24f438e9c72fcc77691fe5d70f01c1bb577ee874
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T06:50:29Z

    fixed incorrect markdown in clustering doc

commit 88aacc8fa8aa955be2ec81caf001897b2bc91625
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T19:48:51Z

    Add assert to testcase on cluster sizes

commit 43ab10be1c634f88d08f666df71ff15427e8a3d2
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T19:55:09Z

    Change last two println's to log4j logger

commit 218a49d4e74b24bebf94033440904ca7411a28f0
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T20:38:04Z

    Applied Xiangrui's comments - especially removing RDD/PICLinalg classes and making noncritical methods private

commit 1c3a62ea8d45609e22bf2394a73930b1a334422d
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T21:23:52Z

    removed matplot.py and reordered all private methods to bottom of PIC

commit 121e4d5fc0a0ab61a211fc71fea7a74775feb763
Author: sboeschhuawei <stephen.boe...@huawei.com>
Date:   2015-01-28T21:33:29Z

    Remove unused testing data files

----