[
https://issues.apache.org/jira/browse/MAHOUT-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171274#comment-15171274
]
sai commented on MAHOUT-551:
-----------------------------
Hi Andrew, thanks for responding. I was given the task of running k-means on a
CSV input file from the Mahout command line. Could you please tell me how to
run it? I tried the class org.apache.clustering.generic.kmeans.job, but I got
a "class not found" exception.
My other question concerns the synthetic control example: is that the only
dataset the code works on, or can we supply any space-delimited numerical input?
Thanks in advance,
Sai
> Kmeans example with space delimited data
> ----------------------------------------
>
> Key: MAHOUT-551
> URL: https://issues.apache.org/jira/browse/MAHOUT-551
> Project: Mahout
> Issue Type: Improvement
> Components: Integration
> Affects Versions: 0.4
> Reporter: Djellel Eddine Difallah
> Assignee: Jeff Eastman
> Priority: Minor
> Fix For: 0.5
>
> Attachments: MAHOUT-551.patch, MAHOUT-551.patch
>
>
> The provided example for Kmeans clustering using the synthetic control data
> asks for t1 and t2 measures because it runs the Canopy Driver to determine
> the initial clusters. Kmeans originally requires a K variable to generate
> random centers from the input data. I propose to add another example in the
> package which will serve for any space delimited numerical input to cluster
> with Kmeans in its original form and not using Canopy. The modification is
> quite simple and is mostly based on the synthetic control Job.
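
For illustration only, the idea described above (read space-delimited numeric
rows, pick K random input points as initial centers, then iterate k-means
without Canopy) can be sketched in plain Java. This is not the attached patch
and does not use Mahout's drivers; the class name SpaceDelimitedKMeans, its
method names, the fixed seed and the iteration cap are all assumptions made
for the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch in plain Java; not Mahout code and not the MAHOUT-551 patch.
class SpaceDelimitedKMeans {

    // Parse one line of whitespace-separated numbers into a point.
    static double[] parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double[] point = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            point[i] = Double.parseDouble(tokens[i]);
        }
        return point;
    }

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the center closest to the given point.
    static int nearest(double[] point, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(point, centers[c]) < squaredDistance(point, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    // Lloyd's iterations; initial centers are k input points drawn at random
    // without replacement (assumes all points share the same dimension).
    static double[][] cluster(List<double[]> points, int k, int maxIterations, long seed) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, new Random(seed));
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = shuffled.get(c).clone();
        }
        int dim = centers[0].length;
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int i = 0; i < dim; i++) {
                    sums[c][i] += p[i];
                }
            }
            boolean moved = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int i = 0; i < dim; i++) {
                    double mean = sums[c][i] / counts[c];
                    if (mean != centers[c][i]) {
                        moved = true;
                    }
                    centers[c][i] = mean;
                }
            }
            if (!moved) {
                break; // centers are stable, so the assignment has converged
            }
        }
        return centers;
    }

    // Usage: java SpaceDelimitedKMeans <input-file> <k>
    public static void main(String[] args) throws IOException {
        List<double[]> points = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            if (!line.trim().isEmpty()) {
                points.add(parseLine(line));
            }
        }
        int k = Integer.parseInt(args[1]);
        for (double[] center : cluster(points, k, 10, 42L)) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```

The sketch runs in a single JVM on an in-memory list, so it only shows the
shape of the computation; a Hadoop job based on the synthetic control Job
would distribute the assignment and averaging steps.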
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)