[ https://issues.apache.org/jira/browse/SPARK-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122273#comment-14122273 ]
RJ Nowling commented on SPARK-2430:
-----------------------------------

Hi Yu,

The community had suggested looking into scikit-learn's API so that is a good idea.

I am hesitant to make backwards-incompatible API changes, however, until we know the new API will be stable for a long time. I think it would be best to implement a few more clustering algorithms to get a clear idea of what is similar vs different before making a new API.

May I suggest you work on SPARK-2966 / SPARK-2429 first?

RJ

> Standarized Clustering Algorithm API and Framework
> --------------------------------------------------
>
>                 Key: SPARK-2430
>                 URL: https://issues.apache.org/jira/browse/SPARK-2430
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: RJ Nowling
>            Priority: Minor
>
> Recently, there has been a chorus of voices on the mailing lists about adding
> new clustering algorithms to MLlib. To support these additions, we should
> develop a common framework and API to reduce code duplication and keep the
> APIs consistent.
> At the same time, we can also expand the current API to incorporate requested
> features such as arbitrary distance metrics or pre-computed distance matrices.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)