Implement clustering of massive-domain attributes
-------------------------------------------------

                 Key: MAHOUT-173
                 URL: https://issues.apache.org/jira/browse/MAHOUT-173
             Project: Mahout
          Issue Type: New Feature
          Components: Clustering
            Reporter: Matias Bjørling
            Priority: Trivial


Implement the clustering algorithm described in "A Framework for Clustering 
Massive-Domain Data Streams" by Charu C. Aggarwal.

Steps: 

1. Implement a baseline solution to compare the new implementation against.
2. Figure out how to implement loading of the clustering data, using the 
k-means implementation as a reference.
3. Implement the Count-Min sketch algorithm for each cluster.
4. Find out how to let the user choose the distance function for 
the input data (this may already be possible).
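
As a rough illustration of step 3, here is a minimal Count-Min sketch in plain 
Java. It is a sketch under stated assumptions, not Mahout code and not 
necessarily the exact construction used in the paper: the class name, the 
constructor parameters (depth, width, seed) and the hash family 
((a * x + b) mod p) mod width are all choices made for this example.

```java
import java.util.Random;

/** Hypothetical Count-Min sketch: approximate frequency counts in fixed memory. */
final class CountMinSketch {
    // Mersenne prime 2^31 - 1, used as the modulus of the hash family (assumption).
    private static final long PRIME = 2147483647L;

    private final int depth;        // number of hash rows
    private final int width;        // number of counters per row
    private final long[][] counts;  // depth x width counter table
    private final long[] a;         // per-row multiplier, in [1, PRIME - 1]
    private final long[] b;         // per-row offset, in [0, PRIME - 1]

    CountMinSketch(int depth, int width, long seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
        this.a = new long[depth];
        this.b = new long[depth];
        Random rnd = new Random(seed);
        for (int i = 0; i < depth; i++) {
            a[i] = 1 + rnd.nextInt((int) PRIME - 1);
            b[i] = rnd.nextInt((int) PRIME);
        }
    }

    /** Maps an item to a counter index in the given row. */
    private int bucket(int row, Object item) {
        long x = item.hashCode() & 0x7fffffffL; // force a non-negative value
        long h = (a[row] * x + b[row]) % PRIME; // fits in a long: both factors < 2^31
        return (int) (h % width);
    }

    /** Adds a non-negative count for the item to one counter in every row. */
    void add(Object item, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(row, item)] += count;
        }
    }

    /** Returns the minimum counter over all rows; never below the true count. */
    long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(row, item)]);
        }
        return min;
    }
}
```

Because counts are only ever added, hash collisions can inflate an estimate but 
never reduce it, so estimate() is an upper bound on the true frequency. For 
step 3, one such sketch could be kept per cluster (for example in an array 
indexed by cluster id) and updated with the attribute values of each point 
assigned to that cluster.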


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.