[ 
https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816858#comment-13816858
 ] 

Chisomo Sakala commented on MAHOUT-1206:
----------------------------------------

I'm really excited about this prospect.

The paper <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6253489> 
describes a MapReduce implementation of DBSCAN (DBSCAN-MR). I have a PDF 
copy and can email it to anyone interested in reading it. 

I emailed the author of that paper to ask whether they'd be willing to 
contribute their implementation of DBSCAN-MR to Mahout, but I haven't 
received a response yet.

Here is another paper discussing parallelization of DBSCAN:
<http://conferences.computer.org/sc/2012/papers/1000a053.pdf>
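
For anyone unfamiliar with the sequential algorithm that both papers set out 
to parallelize, here is a minimal single-machine sketch of plain DBSCAN. It is 
not the DBSCAN-MR code from the paper and not existing Mahout code. The class 
name DbscanSketch, the method names, and the brute-force neighbor search are 
my own assumptions for illustration; eps (neighborhood radius) and minPts 
(minimum neighbors for a core point) are the usual DBSCAN parameters.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only: not part of Mahout, not the DBSCAN-MR code.
final class DbscanSketch {
    static final int UNVISITED = -2;
    static final int NOISE = -1;

    /** Returns one label per point: a cluster id (0, 1, ...) or NOISE (-1). */
    static int[] cluster(double[][] points, double eps, int minPts) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) {
                continue;
            }
            List<Integer> neighbors = regionQuery(points, i, eps);
            if (neighbors.size() < minPts) {
                labels[i] = NOISE; // may still become a border point later
                continue;
            }
            // Point i is a core point: start a new cluster and expand it.
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbors);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) {
                    labels[j] = clusterId; // border point reached from a core point
                }
                if (labels[j] != UNVISITED) {
                    continue;
                }
                labels[j] = clusterId;
                List<Integer> jNeighbors = regionQuery(points, j, eps);
                if (jNeighbors.size() >= minPts) {
                    queue.addAll(jNeighbors); // j is also a core point
                }
            }
            clusterId++;
        }
        return labels;
    }

    // Brute-force neighborhood search; the result includes the query point itself.
    static List<Integer> regionQuery(double[][] points, int idx, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < points.length; k++) {
            if (distance(points[idx], points[k]) <= eps) {
                result.add(k);
            }
        }
        return result;
    }

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Calling DbscanSketch.cluster(points, eps, minPts) returns an int array with 
one label per input point. The regionQuery step compares every pair of points, 
so the sketch is quadratic in the number of points, which is why a distributed 
version is of interest for large data sets.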


> Add density-based clustering algorithms to mahout
> -------------------------------------------------
>
>                 Key: MAHOUT-1206
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1206
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Yexi Jiang
>              Labels: clustering
>             Fix For: Backlog
>
>
> The existing clustering algorithms (k-means, fuzzy k-means, Dirichlet 
> clustering, and spectral clustering) assume that the data can be grouped 
> into regular hyperspheres or ellipsoids. In practice, however, not all 
> data can be clustered this way. 
> To allow data to be clustered into arbitrary shapes, clustering algorithms 
> such as DBSCAN, BIRCH, and CLARANS 
> (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have 
> been proposed.
> It would be good to implement one or more of these clustering algorithms 
> to enrich the clustering library. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)
