[jira] Commented: (MAHOUT-160) ClusterDumper utility to output all the clusters in all sequence files and points

Grant Ingersoll (JIRA) Thu, 06 Aug 2009 08:06:52 -0700

    [ https://issues.apache.org/jira/browse/MAHOUT-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740090#action_12740090 ]


Grant Ingersoll commented on MAHOUT-160:
----------------------------------------

Committed revision 801661.

> ClusterDumper utility to output all the clusters in all sequence files and 
> points
> ---------------------------------------------------------------------------------
>
>                 Key: MAHOUT-160
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-160
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Shashikant Kore
>            Assignee: Grant Ingersoll
>         Attachments: mahout-160-dict.patch, mahout-160.patch
>
>
> The current ClusterDumper utility takes a sequence file and a points file as 
> input and prints each cluster vector along with the points that belong to 
> the clusters in that sequence file. The utility does not produce correct 
> results when there are multiple sequence files and points files. 
> To avoid this problem, all the point-to-cluster mappings need to be read 
> first, before iterating over the sequence files.
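
The two-pass approach described in the issue can be illustrated with a minimal
sketch. This is not the committed ClusterDumper code and it does not use the
Hadoop or Mahout APIs: the class name ClusterDumpSketch, the methods
readAllMappings and dump, and the in-memory stand-ins for points files
(point id to cluster id maps) and sequence files (lists of cluster ids) are
all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory collections stand in for the real
// sequence files and points files.
public class ClusterDumpSketch {

    // Pass 1: merge the point-to-cluster assignments from every points file
    // into a single map keyed by cluster id.
    static Map<String, List<String>> readAllMappings(List<Map<String, String>> pointsFiles) {
        Map<String, List<String>> pointsByCluster = new LinkedHashMap<>();
        for (Map<String, String> file : pointsFiles) {
            for (Map.Entry<String, String> e : file.entrySet()) {
                // e.getKey() is the point id, e.getValue() is its cluster id.
                pointsByCluster
                    .computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                    .add(e.getKey());
            }
        }
        return pointsByCluster;
    }

    // Pass 2: iterate over every cluster file and look up each cluster's
    // points in the complete map built by pass 1.
    static List<String> dump(List<List<String>> clusterFiles,
                             Map<String, List<String>> pointsByCluster) {
        List<String> lines = new ArrayList<>();
        for (List<String> file : clusterFiles) {
            for (String clusterId : file) {
                List<String> points = pointsByCluster.getOrDefault(clusterId, List.of());
                lines.add(clusterId + ": " + points);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Points for cluster c1 are spread across two different points files.
        List<Map<String, String>> pointsFiles = List.of(
            Map.of("p1", "c1"),
            Map.of("p2", "c2"),
            Map.of("p3", "c1"));
        List<List<String>> clusterFiles = List.of(List.of("c1"), List.of("c2"));

        Map<String, List<String>> mappings = readAllMappings(pointsFiles);
        for (String line : dump(clusterFiles, mappings)) {
            System.out.println(line);
        }
    }
}
```

Because the map is complete before any cluster is printed, a cluster whose
points are spread across several points files is still listed with all of
them, which is the failure case the issue describes.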

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
