[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679170#comment-13679170 ]
Grant Ingersoll commented on MAHOUT-1030:
-----------------------------------------

Pat, do you have a patch for this that demonstrates what you are suggesting so that we can compare?

> Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1030
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1030
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering, Integration
>    Affects Versions: 0.7
>            Reporter: Jeff Eastman
>            Assignee: Suneel Marthi
>             Fix For: 0.8
>
>         Attachments: MAHOUT-1030.patch, MAHOUT-1030.patch, MAHOUT-1030.patch
>
>
> Looks like this won't make it into this build. Pretty widespread impact on
> code and tests and I don't know which properties were implemented in the old
> version. I will create a JIRA and post my interim results.
> On 6/8/12 12:21 PM, Jeff Eastman wrote:
> > That's a reversion that evidently got in when the new
> > ClusterClassificationDriver was introduced. It should be a pretty easy fix
> > and I will see if I can make the change before Paritosh cuts the release
> > bits tonight.
> >
> > On 6/7/12 1:00 PM, Pat Ferrel wrote:
> >> It appears that in kmeans the clusteredPoints are now written as
> >> WeightedVectorWritable where in mahout 0.6 they were
> >> WeightedPropertyVectorWritable? This means that the distance from the
> >> centroid is no longer stored here? Why? I hope I'm wrong because that is
> >> not a welcome change. How is one to order clustered docs by distance from
> >> cluster centroid?
> >>
> >> I'm sure I could calculate the distance but that would mean looking up the
> >> centroid for the cluster id given in the above WeightedVectorWritable,
> >> which means iterating through all the clusters for each clustered doc. In
> >> my case the number of clusters could be fairly large.
> >>
> >> Am I missing something?
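
For comparison, the workaround Pat describes (recomputing each point's distance from the centroid of its cluster id) need not scan every cluster for every clustered doc if the centroids are first loaded into a map keyed by cluster id. The following is only a minimal sketch in plain Java and uses no Mahout classes: `ClusteredPoint`, `distance`, and `orderByDistance` are hypothetical names, Euclidean distance is an assumption (the clustering job may use a different measure), and reading the clusters and clustered points from their sequence files is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CentroidDistanceSketch {

    /** Hypothetical stand-in for a clustered point: a cluster id plus the point's vector. */
    static final class ClusteredPoint {
        final int clusterId;
        final double[] vector;

        ClusteredPoint(int clusterId, double[] vector) {
            this.clusterId = clusterId;
            this.vector = vector;
        }
    }

    /** Euclidean distance between two vectors of equal length (an assumed measure). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns a copy of the points sorted by distance to their own cluster's centroid.
     * Assumes every cluster id that appears in the points has an entry in the map.
     */
    static List<ClusteredPoint> orderByDistance(List<ClusteredPoint> points,
                                                Map<Integer, double[]> centroids) {
        List<ClusteredPoint> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (ClusteredPoint p) -> distance(p.vector, centroids.get(p.clusterId))));
        return sorted;
    }

    public static void main(String[] args) {
        // Built once, in a single pass over the clusters.
        Map<Integer, double[]> centroids = new HashMap<>();
        centroids.put(0, new double[] {0.0, 0.0});
        centroids.put(1, new double[] {10.0, 10.0});

        List<ClusteredPoint> points = new ArrayList<>();
        points.add(new ClusteredPoint(0, new double[] {3.0, 4.0}));
        points.add(new ClusteredPoint(1, new double[] {10.0, 11.0}));
        points.add(new ClusteredPoint(0, new double[] {0.0, 2.0}));

        for (ClusteredPoint p : orderByDistance(points, centroids)) {
            System.out.println("cluster " + p.clusterId + " distance "
                    + distance(p.vector, centroids.get(p.clusterId)));
        }
    }
}
```

With the map in place each centroid lookup is constant time on average, so a large number of clusters costs one pass to build the map rather than one scan per doc. It still recomputes a value that 0.6 stored alongside the point, which is the regression this issue is about.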