[ https://issues.apache.org/jira/browse/MAHOUT-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Chow updated MAHOUT-54:
------------------------------

    Attachment: canopykeams.patch

original implementation

> parallelize k-means sharing the predominance of canopies
> --------------------------------------------------------
>
>                 Key: MAHOUT-54
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-54
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.1
>         Environment: OS Independent
>            Reporter: Jeremy Chow
>             Fix For: 0.1
>
>         Attachments: canopykeams.patch
>
>
> At present, Mahout uses the canopy algorithm only to create the initial
> cluster centroids for k-means. On every iteration, k-means then calculates
> the distance from each center to every point. But the most important
> benefit of canopies is that each center needs distance calculations only
> against the much smaller number of points that lie in the same canopy.
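The idea in the description above can be sketched roughly as follows. This is a minimal single-machine illustration in Python, not the code in canopykeams.patch and not Mahout's API. All function names and the thresholds t1 and t2 (with t1 > t2) are hypothetical. For brevity the same Euclidean distance is used for both the canopy pass and k-means, whereas the canopy technique normally uses a cheaper metric for the canopy pass. Falling back to all centers when a point shares no canopy with any center is also an assumption of this sketch.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def build_canopies(points, t2):
    """Pick canopy centers: each pick removes every point within t2 of it."""
    remaining = list(points)
    canopy_centers = []
    while remaining:
        center = remaining[0]
        canopy_centers.append(center)
        remaining = [p for p in remaining if distance(p, center) > t2]
    return canopy_centers


def canopies_of(x, canopy_centers, t1):
    """Indices of the canopies whose center lies within t1 of x."""
    return {j for j, c in enumerate(canopy_centers) if distance(x, c) <= t1}


def kmeans_with_canopies(points, centers, t1, t2, iterations=10):
    """Run k-means, comparing a point only with centers sharing a canopy."""
    if not t1 > t2 >= 0:
        raise ValueError("expected t1 > t2 >= 0")
    canopy_centers = build_canopies(points, t2)
    # Point membership never changes, so compute it once, outside the loop.
    point_canopies = [canopies_of(x, canopy_centers, t1) for x in points]
    centers = list(centers)
    labels = [0] * len(points)
    comparisons = 0  # center-to-point distances computed during assignment
    for _ in range(iterations):
        # Centers move each iteration, so their canopies are recomputed.
        center_canopies = [canopies_of(c, canopy_centers, t1) for c in centers]
        for i, x in enumerate(points):
            candidates = [k for k, cc in enumerate(center_canopies)
                          if cc & point_canopies[i]]
            if not candidates:  # assumption: fall back to plain k-means
                candidates = list(range(len(centers)))
            comparisons += len(candidates)
            labels[i] = min(candidates, key=lambda k: distance(x, centers[k]))
        for k in range(len(centers)):
            members = [points[i] for i, lab in enumerate(labels) if lab == k]
            if members:  # keep the old center if the cluster is empty
                centers[k] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, labels, comparisons


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    result = kmeans_with_canopies(pts, [(0.0, 0.0), (10.0, 10.0)],
                                  t1=3.0, t2=1.5, iterations=5)
    print(result)
```

On this toy input each point shares a canopy with exactly one center, so the assignment step computes 6 distances per iteration (30 over 5 iterations) instead of the 12 per iteration (60 in total) that plain k-means would need. The count excludes the distances spent on canopy membership itself, which is where a cheaper metric would normally be used.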