Make cluster top terms code more reusable
-----------------------------------------
Key: MAHOUT-845
URL: https://issues.apache.org/jira/browse/MAHOUT-845
Project: Mahout
Issue Type: Improvement
Components: Clustering
Reporter: Frank Scholten
Priority: Minor

When working with Mahout text clustering, I find that I keep writing code similar to the contents of

    public static String getTopFeatures(Cluster cluster, String[] dictionary, int numTerms)

in ClusterDumper in order to determine cluster labels.

I think it would be useful if (parts of) this code were added to the cluster or vector API, so that you could do something like

    Cluster cluster = ... // get the cluster from seq file iterable
    String clusterLabel = cluster.getTopTerms(1, dictionary);
    // Do something with the label

I think this would make it easier to export and post-process clustering results, for example to index them or store them elsewhere.

Thoughts?
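
To make the proposal a bit more concrete, here is a minimal sketch of the kind of reusable helper I have in mind. It is an illustration only, not existing Mahout code: the class name TopTerms and the methods topTerms and label are hypothetical, and the cluster centroid is represented as a plain double[] (where centroid[i] is the weight of dictionary[i]) instead of a Mahout Vector, so that the example compiles on its own. The argument order follows getTopFeatures rather than the getTopTerms(1, dictionary) call above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Mahout API. */
final class TopTerms {

    private TopTerms() {
    }

    /**
     * Returns up to numTerms dictionary entries, ordered by descending
     * centroid weight. Assumes centroid[i] is the weight of dictionary[i];
     * entries with zero weight are skipped.
     */
    static List<String> topTerms(double[] centroid, String[] dictionary, int numTerms) {
        if (numTerms < 0) {
            throw new IllegalArgumentException("numTerms must not be negative");
        }
        if (centroid.length > dictionary.length) {
            throw new IllegalArgumentException("dictionary is smaller than the centroid");
        }

        // Collect the indices of all non-zero weights.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < centroid.length; i++) {
            if (centroid[i] != 0.0) {
                indices.add(i);
            }
        }

        // Sort indices by weight, largest first. List.sort is stable,
        // so equal weights keep their dictionary order.
        indices.sort(Comparator.comparingDouble((Integer i) -> centroid[i]).reversed());

        // Map the best indices back to their terms.
        List<String> terms = new ArrayList<>();
        int limit = Math.min(numTerms, indices.size());
        for (int k = 0; k < limit; k++) {
            terms.add(dictionary[indices.get(k)]);
        }
        return terms;
    }

    /** Returns the single highest-weighted term, or an empty string if there is none. */
    static String label(double[] centroid, String[] dictionary) {
        List<String> top = topTerms(centroid, dictionary, 1);
        return top.isEmpty() ? "" : top.get(0);
    }

    public static void main(String[] args) {
        // Made-up weights and terms, purely for demonstration.
        double[] centroid = {0.1, 0.0, 0.9, 0.4};
        String[] dictionary = {"apple", "banana", "cherry", "date"};
        System.out.println(label(centroid, dictionary));
    }
}
```

In Mahout itself the same logic would presumably read the weights from the cluster's center vector, and whether it belongs on Cluster, on Vector, or in a separate utility class is exactly the open question of this issue.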