[ https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816858#comment-13816858 ]
Chisomo Sakala commented on MAHOUT-1206:
----------------------------------------

I'm really excited about this prospect.

The paper <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6253489> describes how to implement DBSCAN on MapReduce (DBSCAN-MR). I have a PDF copy and can email it to anyone interested in viewing it.

I emailed the author of that paper to ask whether they would be willing to contribute their implementation of DBSCAN-MR to Mahout, but I haven't received a response yet.

Here is another paper discussing parallelization of DBSCAN: <http://conferences.computer.org/sc/2012/papers/1000a053.pdf>

> Add density-based clustering algorithms to mahout
> -------------------------------------------------
>
>                 Key: MAHOUT-1206
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1206
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Yexi Jiang
>              Labels: clustering
>             Fix For: Backlog
>
>
> The clustering algorithms (kmeans, fuzzy kmeans, dirichlet clustering, and
> spectral clustering) cluster data by assuming that the data can be grouped
> into regular hyperspheres or ellipsoids. However, in practice, not all
> data can be clustered in this way.
> To enable data to be clustered in arbitrary shapes, clustering algorithms
> like DBSCAN, BIRCH, CLARANS
> (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have
> been proposed.
> It would be good to implement one or more of these clustering algorithms
> to enrich the clustering library.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
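
For readers unfamiliar with the algorithm under discussion, here is a minimal single-machine sketch of plain DBSCAN in Java. It is not the DBSCAN-MR method from the papers linked above and it does not use any Mahout API; the class name DbscanSketch, the method cluster, and the parameters eps and minPts are names chosen for this illustration only. It uses a brute-force neighbor search, so it is only meant to show why density-based clustering does not assume spherical clusters: a cluster grows through any chain of densely packed points, whatever shape that chain takes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration class; not part of Mahout.
final class DbscanSketch {
    static final int NOISE = -1;
    private static final int UNVISITED = -2;

    // Returns one label per point: 0, 1, 2, ... for clusters, NOISE for outliers.
    // eps is the neighborhood radius; minPts is the minimum number of points
    // (including the point itself) within eps for a point to count as a core point.
    static int[] cluster(double[][] points, double eps, int minPts) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) {
                continue;
            }
            List<Integer> seeds = neighbors(points, i, eps);
            if (seeds.size() < minPts) {
                // Not a core point; a later cluster may still claim it as a border point.
                labels[i] = NOISE;
                continue;
            }
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) {
                    labels[j] = clusterId; // border point: joins the cluster, is not expanded
                }
                if (labels[j] != UNVISITED) {
                    continue;
                }
                labels[j] = clusterId;
                List<Integer> reachable = neighbors(points, j, eps);
                if (reachable.size() >= minPts) {
                    queue.addAll(reachable); // j is a core point, so keep growing through it
                }
            }
            clusterId++;
        }
        return labels;
    }

    // Brute-force O(n) scan per query; the result includes the center point itself.
    private static List<Integer> neighbors(double[][] points, int center, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < points.length; k++) {
            if (distance(points[center], points[k]) <= eps) {
                result.add(k);
            }
        }
        return result;
    }

    // Euclidean distance between two points of equal dimension.
    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Calling DbscanSketch.cluster(points, 1.5, 3) on two tight groups of three points plus one far-away point would label the groups 0 and 1 and mark the isolated point as NOISE. The per-point neighbor scan makes this quadratic in the number of points, which is the cost a distributed variant would need to address.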