[
https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644120#comment-13644120
]
Ted Dunning commented on MAHOUT-1177:
-------------------------------------
Before you do much planning you should organize two groups of people:
a) your co-contributors
b) the existing clustering stakeholders. These include Robin Anil and Jeff
Eastman, because of their authorship of the previous interface, as well as Dan
Filimon and myself, because of our authorship of the new streaming k-means
clustering code.
The first step should be to listen to what people think before you decide on
what to do.
The things you should deliver first include:
1) a survey of the current interface and the problems with it
2) a survey of the algorithms to be supported in the new unified interface
This should spark considerable discussion on the mailing list.
Only then should you consider making a proposal for changes.
> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>
> Key: MAHOUT-1177
> URL: https://issues.apache.org/jira/browse/MAHOUT-1177
> Project: Mahout
> Issue Type: Improvement
> Reporter: Dan Filimon
> Labels: gsoc2013, mentor
>
> Clustering is one of the most used features in Mahout and has many
> applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> We have lots of clustering algorithms. There are:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure they
> all work in the same way, with consistent inputs, outputs, diagnostics and
> documentation.
> This is a great way to gain an in-depth understanding of clustering
> algorithms, familiarize yourself with Hadoop, Mahout clustering and good
> software engineering principles.