[ https://issues.apache.org/jira/browse/MAHOUT-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Trevor Grant updated MAHOUT-1959:
---------------------------------
    Fix Version/s: classic-15.0

> BallKMeans.iterativeAssignment can set wrong weights.
> -----------------------------------------------------
>
>                 Key: MAHOUT-1959
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1959
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Hao Zhong
>            Assignee: Shashanka Balakuntala Srinivasa
>            Priority: Major
>             Fix For: classic-15.0
>
>
> I noticed that the BallKMeans.iterativeAssignment method uses the following
> code to calculate weights:
> {code:title=BallKMeans.java|borderStyle=solid}
> for (WeightedVector datapoint : datapoints) {
>   Centroid closestCentroid = (Centroid) centroids.searchFirst(datapoint, false).getValue();
>   closestCentroid.setWeight(closestCentroid.getWeight() + datapoint.getWeight());
> }
> {code}
> In MAHOUT-1237, the buggy code read the weight in the same way, from the
> Centroid returned by getValue():
> {code:title=ClusteringUtils.java|borderStyle=solid}
> for (Vector vector : datapoints) {
>   Centroid closest = (Centroid) centroids.searchFirst(vector, false).getValue();
>   totalCost += closest.getWeight();
> }
> {code}
> The fixed code is as follows:
> {code:title=ClusteringUtils.java|borderStyle=solid}
> for (Vector vector : datapoints) {
>   totalCost += centroids.searchFirst(vector, false).getWeight();
> }
> {code}
> I am not sure whether BallKMeans.iterativeAssignment sets the right
> weights. Please check it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
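
For anyone comparing the two patterns quoted in the issue: in the ClusteringUtils fix, getWeight() is called on the search result itself rather than on the Centroid it wraps. The sketch below is a minimal illustration of why that distinction can matter. It uses hypothetical stand-in classes (Point, Match, and a 1-D searchFirst), not Mahout's own types, and it assumes that the wrapper's weight is the distance to the query while a centroid's weight is its accumulated mass. It does not settle whether BallKMeans resets centroid weights before the loop, since the quoted snippet does not show that.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightSketch {
    // Hypothetical stand-in for a weighted vector or centroid: a 1-D position plus a mass.
    static final class Point {
        final double position;
        double weight;

        Point(double position, double weight) {
            this.position = position;
            this.weight = weight;
        }
    }

    // Hypothetical stand-in for a search result. ASSUMPTION: its weight is the
    // distance from the query to the matched value, not the value's own mass.
    static final class Match {
        final Point value;
        final double weight;

        Match(Point value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    // Linear nearest-neighbour search over 1-D positions.
    static Match searchFirst(List<Point> centroids, Point query) {
        Match best = null;
        for (Point c : centroids) {
            double distance = Math.abs(c.position - query.position);
            if (best == null || distance < best.weight) {
                best = new Match(c, distance);
            }
        }
        return best;
    }

    // Accumulation pattern: add each datapoint's mass to its closest centroid.
    // Centroid weights are NOT reset here; callers must start from the state they want.
    static void assignWeights(List<Point> centroids, List<Point> datapoints) {
        for (Point p : datapoints) {
            Point closest = searchFirst(centroids, p).value;
            closest.weight += p.weight;
        }
    }

    // Cost pattern: sum the distances, read from the search result itself.
    static double totalCost(List<Point> centroids, List<Point> datapoints) {
        double total = 0.0;
        for (Point p : datapoints) {
            total += searchFirst(centroids, p).weight;
        }
        return total;
    }
}
```

In this sketch, reading the mass off the matched centroid is what the accumulation loop wants, while the cost loop wants the distance from the search result. Which of the two the real BallKMeans.iterativeAssignment needs, and whether its weights start from zero on each iteration, still has to be checked against the Mahout source.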