Sure, why don't you go ahead and post a patch?

Pallavi Palleti (JIRA) wrote:
[ https://issues.apache.org/jira/browse/MAHOUT-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683312#action_12683312 ]
Pallavi Palleti commented on MAHOUT-99:
---------------------------------------

I used KeyValueLineRecordReader internally in my code and forgot to 
revert to SequenceFileReader. Would it be sufficient to submit another patch 
against the latest code that modifies only KMeansDriver to use SequenceFileReader? 
Kindly let me know.

Thanks
Pallavi

Improving speed of KMeans
-------------------------

                Key: MAHOUT-99
                URL: https://issues.apache.org/jira/browse/MAHOUT-99
            Project: Mahout
         Issue Type: Improvement
         Components: Clustering
           Reporter: Pallavi Palleti
           Assignee: Grant Ingersoll
            Fix For: 0.1

        Attachments: MAHOUT-99-1.patch, Mahout-99.patch, MAHOUT-99.patch


Improved the speed of KMeans by passing only the cluster ID from mapper to reducer. 
Previously, the whole cluster info was sent as a formatted string.
Also removed the implicit assumption that the combiner runs exactly once. 
The code is modified accordingly so that it no longer produces a bug when the 
combiner runs zero times or more than once.
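
A minimal sketch of why a combiner can safely run zero, one, or many times; 
this is not the code from the attached patches. It assumes the value emitted 
per point is a (count, coordinate-wise sum) pair keyed by the cluster ID alone, 
and that the division happens only in the reducer. The names PartialSum and 
CombinerSafeKMeans are hypothetical, and the example uses plain Java rather 
than the Hadoop or Mahout APIs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical partial result for one cluster: point count plus coordinate-wise sum. */
final class PartialSum {
    final long count;
    final double[] sum;

    PartialSum(long count, double[] sum) {
        this.count = count;
        this.sum = sum.clone();
    }

    /** Value a mapper would emit for one point; the key would be the cluster ID only. */
    static PartialSum ofPoint(double[] point) {
        return new PartialSum(1, point);
    }

    /** Associative and commutative, so it can be applied any number of times. */
    PartialSum merge(PartialSum other) {
        double[] merged = new double[sum.length];
        for (int i = 0; i < sum.length; i++) {
            merged[i] = sum[i] + other.sum[i];
        }
        return new PartialSum(count + other.count, merged);
    }

    /** Final step, done only in the reducer: divide the sum by the count. */
    double[] centroid() {
        double[] c = new double[sum.length];
        for (int i = 0; i < sum.length; i++) {
            c[i] = sum[i] / count;
        }
        return c;
    }
}

final class CombinerSafeKMeans {
    /**
     * The same aggregation serves as the combiner and as the reducer's first step.
     * Assumes at least one value per key, as a reducer would see.
     */
    static PartialSum combine(List<PartialSum> values) {
        PartialSum total = values.get(0);
        for (PartialSum v : values.subList(1, values.size())) {
            total = total.merge(v);
        }
        return total;
    }

    public static void main(String[] args) {
        PartialSum a = PartialSum.ofPoint(new double[] {1, 2});
        PartialSum b = PartialSum.ofPoint(new double[] {3, 4});
        PartialSum c = PartialSum.ofPoint(new double[] {5, 6});

        // Combiner never runs: the reducer sees all three raw values.
        double[] zeroRuns = combine(Arrays.asList(a, b, c)).centroid();
        // Combiner runs once on (a, b) before the reducer.
        double[] oneRun = combine(Arrays.asList(combine(Arrays.asList(a, b)), c)).centroid();
        // Combiner runs twice, the second time on already-combined output.
        PartialSum twice = combine(Arrays.asList(combine(Arrays.asList(a, b))));
        double[] twoRuns = combine(Arrays.asList(twice, c)).centroid();

        System.out.println(Arrays.toString(zeroRuns));
        System.out.println(Arrays.toString(oneRun));
        System.out.println(Arrays.toString(twoRuns));
    }
}
```

Because merge only adds counts and sums, the three paths in main reach the 
same centroid. A combiner that averaged its inputs instead would give a 
different answer depending on how often it ran.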

