Sure, why don't you go ahead and post a patch?
Pallavi Palleti (JIRA) wrote:
[ https://issues.apache.org/jira/browse/MAHOUT-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683312#action_12683312 ]

Pallavi Palleti commented on MAHOUT-99:
---------------------------------------

I used KeyValueLineRecordReader internally for my code and forgot to revert to SequenceFileReader. Would it be sufficient to add another patch against the latest code that modifies only KMeansDriver to use SequenceFileReader? Kindly let me know.

Thanks
Pallavi

Improving speed of KMeans
-------------------------

Key: MAHOUT-99
URL: https://issues.apache.org/jira/browse/MAHOUT-99
Project: Mahout
Issue Type: Improvement
Components: Clustering
Reporter: Pallavi Palleti
Assignee: Grant Ingersoll
Fix For: 0.1
Attachments: MAHOUT-99-1.patch, Mahout-99.patch, MAHOUT-99.patch

Improved the speed of KMeans by passing only the cluster ID from mapper to reducer. Previously, the whole cluster info was being sent as a formatted string. Also removed the implicit assumption that the combiner runs exactly once, and modified the code accordingly so that it does not introduce a bug when the combiner runs zero times or more than once.
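
For anyone following the combiner point in the issue description, here is a minimal sketch of why the result stays correct whether the combiner runs zero, one, or several times. It is not taken from the attached patches: the class CombinerSafeSketch, the Partial type, and the method names are all hypothetical, and it uses plain Java with no Hadoop classes. The idea shown is that each value carries a partial sum and a point count, so merging is associative and only the final reduce step divides. In the real job the values would be grouped under the cluster ID key; the sketch handles a single cluster.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the code from the MAHOUT-99 patches.
class CombinerSafeSketch {

    /** Partial aggregate for one cluster: running vector sum plus point count. */
    static final class Partial {
        final double[] sum;
        final long count;

        Partial(double[] sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** Map side: a single point becomes a partial aggregate with count 1. */
    static Partial fromPoint(double[] point) {
        return new Partial(point.clone(), 1);
    }

    /** Combine step: merges partials into one partial. Input and output have
     *  the same type, so this can be applied zero or more times. */
    static Partial combine(List<Partial> parts) {
        double[] sum = new double[parts.get(0).sum.length];
        long count = 0;
        for (Partial p : parts) {
            for (int i = 0; i < sum.length; i++) {
                sum[i] += p.sum[i];
            }
            count += p.count;
        }
        return new Partial(sum, count);
    }

    /** Reduce step: merges whatever arrives, then divides once to get the centroid. */
    static double[] reduce(List<Partial> parts) {
        Partial total = combine(parts);
        double[] centroid = new double[total.sum.length];
        for (int i = 0; i < centroid.length; i++) {
            centroid[i] = total.sum[i] / total.count;
        }
        return centroid;
    }

    public static void main(String[] args) {
        List<Partial> mapped = Arrays.asList(
                fromPoint(new double[] {1, 2}),
                fromPoint(new double[] {3, 4}),
                fromPoint(new double[] {5, 6}));

        // Combiner never runs: the reducer sees the raw map output.
        double[] none = reduce(mapped);

        // Combiner runs once over the first two values.
        Partial once = combine(mapped.subList(0, 2));
        double[] single = reduce(Arrays.asList(once, mapped.get(2)));

        // Combiner runs again over its own output.
        Partial twice = combine(Arrays.asList(once));
        double[] repeated = reduce(Arrays.asList(twice, mapped.get(2)));

        System.out.println(Arrays.toString(none));     // [3.0, 4.0]
        System.out.println(Arrays.toString(single));   // [3.0, 4.0]
        System.out.println(Arrays.toString(repeated)); // [3.0, 4.0]
    }
}
```

If the combiner emitted an averaged centroid instead of a sum and count, a second pass would average the averages and weight the points incorrectly, which is the kind of bug the description refers to.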