Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14937 @srowen Yes, I'm working on this. You can see the performance test results in the PR description. The optimized k-means achieves roughly a 2-4x speedup by using native BLAS level 3 matrix-matrix multiplications for dense input. However, we saw performance degradation for sparse input: for example, the new implementation took almost twice as long as the old one when training a k-means model on the well-known MNIST data set. Given the current performance test results, I think we should apply this optimization only to dense input and let sparse input continue to run the old code. I have sent the performance test results to @mengxr and am waiting for his opinion. I would also appreciate your thoughts and suggestions. Thanks!
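To illustrate why batching points into a matrix-matrix multiply helps for dense input, here is a minimal NumPy sketch (not the PR's actual Scala code, just the underlying idea): pairwise squared distances can be written as ||x||^2 + ||c||^2 - 2 x.c, where the cross term for all points at once is a single GEMM (BLAS level 3) instead of many point-at-a-time operations. For sparse input, materializing this computation densely is what can erase the win.

```python
import numpy as np

def pairwise_sq_dists_gemm(X, C):
    # ||x - c||^2 = ||x||^2 + ||c||^2 - 2 * (x . c)
    # X @ C.T computes all cross terms in one matrix-matrix multiply
    # (BLAS level 3 GEMM), which is the source of the dense-input speedup.
    x_sq = np.sum(X * X, axis=1)[:, None]   # (n, 1)
    c_sq = np.sum(C * C, axis=1)[None, :]   # (1, k)
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.maximum(x_sq + c_sq - 2.0 * (X @ C.T), 0.0)

def pairwise_sq_dists_loop(X, C):
    # Point-at-a-time baseline: the same math, but done one vector pair
    # at a time (BLAS level 1 style work), with no GEMM batching.
    n, k = X.shape[0], C.shape[0]
    out = np.empty((n, k))
    for i in range(n):
        for j in range(k):
            diff = X[i] - C[j]
            out[i, j] = diff @ diff
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))   # 100 points in 8 dimensions
C = rng.standard_normal((5, 8))    # 5 cluster centers
assert np.allclose(pairwise_sq_dists_gemm(X, C),
                   pairwise_sq_dists_loop(X, C))
```

The two functions agree numerically; the GEMM form simply lets an optimized BLAS do the bulk of the work in one call, which is why the speedup shows up for dense matrices but not for sparse ones.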