[ https://issues.apache.org/jira/browse/MAHOUT-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861194#action_12861194 ]
Jeff Eastman edited comment on MAHOUT-297 at 4/26/10 9:15 PM:
--------------------------------------------------------------

I don't understand why the constructors for Canopy and KMeans Cluster were modified to override the given center vector types, as in:

{noformat}
   public Canopy(Vector point, int canopyId) {
     this.setId(canopyId);
-    this.setCenter(point.clone());
-    this.setPointTotal(point.clone());
+    this.setCenter(new RandomAccessSparseVector(point.clone()));
+    this.setPointTotal(getCenter().clone());
     this.setNumPoints(1);
   }
{noformat}

I can appreciate it might be a performance fix in some situations but forcing the center and total to be another type than that of the argument strikes me as bad practice. With input vectors of arbitrary type, shouldn't the clusters honor the contract to do their math over that type? I'm -1 on this part of the patch.
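
To make the objection concrete, here is a minimal, self-contained sketch of a constructor that keeps whatever vector type it is handed. It deliberately does not use the Mahout classes: Vec, DenseVec, SparseVec and SketchCanopy are hypothetical stand-ins invented for this illustration, and copy() plays the role that clone() plays in the original constructor. It only shows the type-preserving behaviour being argued for and says nothing about the performance trade-off the patch is aimed at.

```java
import java.util.Map;
import java.util.TreeMap;

public class TypePreservingDemo {

    /** Hypothetical stand-in for a vector interface; only what the sketch needs. */
    interface Vec {
        Vec copy();            // plays the role of clone() in the original constructor
        double get(int index);
        int size();
    }

    /** Hypothetical dense implementation backed by an array. */
    static final class DenseVec implements Vec {
        private final double[] values;

        DenseVec(double[] values) { this.values = values.clone(); }

        public Vec copy() { return new DenseVec(values); }
        public double get(int index) { return values[index]; }
        public int size() { return values.length; }
    }

    /** Hypothetical sparse implementation backed by an index-to-value map. */
    static final class SparseVec implements Vec {
        private final int size;
        private final Map<Integer, Double> entries = new TreeMap<>();

        SparseVec(int size) { this.size = size; }

        SparseVec set(int index, double value) {
            entries.put(index, value);
            return this;
        }

        public Vec copy() {
            SparseVec c = new SparseVec(size);
            c.entries.putAll(entries);
            return c;
        }

        public double get(int index) { return entries.getOrDefault(index, 0.0); }
        public int size() { return size; }
    }

    /** Hypothetical cluster whose center and total keep the argument's type. */
    static final class SketchCanopy {
        private final int id;
        private final Vec center;
        private final Vec pointTotal;
        private final int numPoints;

        SketchCanopy(Vec point, int canopyId) {
            this.id = canopyId;
            // Copy the point instead of wrapping it in a fixed implementation,
            // so the runtime type of the argument is carried over.
            this.center = point.copy();
            this.pointTotal = point.copy();
            this.numPoints = 1;
        }

        int getId() { return id; }
        Vec getCenter() { return center; }
        Vec getPointTotal() { return pointTotal; }
        int getNumPoints() { return numPoints; }
    }

    public static void main(String[] args) {
        SketchCanopy dense = new SketchCanopy(new DenseVec(new double[] {1.0, 2.0}), 0);
        SketchCanopy sparse = new SketchCanopy(new SparseVec(5).set(3, 4.0), 1);
        // Each center reports the same implementation class as its input point.
        System.out.println(dense.getCenter().getClass().getSimpleName());
        System.out.println(sparse.getCenter().getClass().getSimpleName());
    }
}
```

In this sketch a dense input yields a dense center and a sparse input yields a sparse one, which is the contract the comment asks the clusters to honor. A caller who wants a particular representation for speed could convert the point before constructing the cluster rather than having the constructor decide.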
> Canopy and Kmeans clustering slows down on using SeqAccVector for center
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-297
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-297
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.4
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>             Fix For: 0.4
>
>         Attachments: MAHOUT-297.patch, MAHOUT-297.patch, MAHOUT-297.patch, MAHOUT-297.patch, MAHOUT-297.patch
>
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.