[jira] [Commented] (MAHOUT-1177) GSOC 2013: Reform and simplify the clustering APIs

Nilesh Chakraborty (JIRA) Sat, 01 Feb 2014 16:54:41 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888789#comment-13888789
 ]


Nilesh Chakraborty commented on MAHOUT-1177:
--------------------------------------------

This is a pretty important issue and I think it'd be awesome to implement; 
I'm quite keen on it myself. Was any code written for this? Has any progress 
been made?

> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>
>                 Key: MAHOUT-1177
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1177
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Dan Filimon
>              Labels: gsoc2013, mentor
>             Fix For: Backlog
>
>
> Clustering is one of the most used features in Mahout and has many 
> applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> We have lots of clustering algorithms. There are:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure they 
> all work in the same way, with consistent inputs, outputs, diagnostics and 
> documentation.
> This is a great way to gain an in-depth understanding of clustering 
> algorithms and to familiarize yourself with Hadoop, Mahout clustering and 
> good software engineering principles.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
