How are you preparing the vectors? You will get the cluster members if these are named vectors. You can prepare named vectors from a sequence file using $MAHOUT_HOME/bin/mahout seq2sparse.
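As a rough sketch of that step and of the dump that follows it, the two commands could be wrapped like this. The function names and the input path mytest/seqdir are made up for illustration. The clusterdump options are copied from the command quoted in this thread, and -i/-o are the usual input and output options of seq2sparse.

```shell
#!/bin/sh
# Sketch only: function names and example paths are hypothetical placeholders.
MAHOUT="${MAHOUT:-$MAHOUT_HOME/bin/mahout}"

# Build sparse vectors that keep each document's name.
# $1 = input sequence-file directory, $2 = output directory
vectorize_named() {
    "$MAHOUT" seq2sparse -i "$1" -o "$2" --namedVector
}

# Dump clusters together with their member points, using the same
# options as the clusterdump command quoted in this thread.
# $1 = clusters dir, $2 = clusteredPoints dir, $3 = dictionary, $4 = output file
dump_clusters() {
    "$MAHOUT" clusterdump -s "$1" -p "$2" -d "$3" -dt sequencefile -n 20 -o "$4"
}

# Example usage (mytest/seqdir is an assumed input location):
#   vectorize_named mytest/seqdir mytest/seqdir-sparse
#   ... re-run k-means on the new vectors ...
#   dump_clusters mytest/kmeans/clusters-1 mytest/kmeans/clusteredPoints \
#       mytest/seqdir-sparse/dictionary.file-0 Desktop/ClusterDump/Kmeans/cl1.txt
```

Since the clustered points are written from whatever vectors the clustering job read, the clustering presumably has to be run again on the named vectors before the dump shows document names.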
Add the --namedVector parameter to that command to create named vectors. The same clusterdump command will then list the members of each cluster. Hope this helps.

On Wed, Apr 6, 2011 at 9:23 AM, Madhusudan Joshi <[email protected]> wrote:

> The command I used to cluster dump is
>
> mahout clusterdump -s mytest/kmeans/clusters-1 -p mytest/kmeans/clusteredPoints -d mytest/seqdir-sparse/dictionary.file-0 -dt sequencefile -n 20 -o Desktop/ClusterDump/Kmeans/cl1.txt
>
> I tried the reuters example and then clustered using my sample files. The
> output of my sample files is
>
> CL-0{n=2 c=[article:3.009, first:3.279, third:3.279] r=[first:3.279, third:3.279]}
>     Top Terms:
>         third => 3.2787654399871826
>         first => 3.2787654399871826
>         article => 3.0087521076202393
>     Weight: Point:
>     1.0: [article:3.009, first:6.558]
>     1.0: [article:3.009, third:6.558]
> VL-1{n=1 c=[article:3.009, second:6.558] r=[article:0.000, first:0.000, fourth:0.000, second:0.000, third:0.000]}
>     Top Terms:
>         second => 6.557530879974365
>         article => 3.0087521076202393
>     Weight: Point:
>     1.0: [article:3.009, second:6.558]
> VL-3{n=1 c=[article:3.009, fourth:6.558] r=[article:0.000, first:0.000, fourth:0.000, second:0.000, third:0.000]}
>     Top Terms:
>         fourth => 6.557530879974365
>         article => 3.0087521076202393
>     Weight: Point:
>     1.0: [article:3.009, fourth:6.558]
>
> The output showed the number of documents present in the cluster but did not
> mention which documents. I need to be able to check which documents are
> present in any given clusters.
>
> On Tue, Apr 5, 2011 at 11:34 PM, Jeff Eastman <[email protected]> wrote:
>
> > You are going to have to be much more explicit in terms of what command
> > line invocations you did and what results you got in order for anybody to be
> > able help you much here. Have you tried the clustering examples in the wiki?
> >
> > -----Original Message-----
> > From: Madhusudan Joshi [mailto:[email protected]]
> > Sent: Monday, April 04, 2011 10:23 PM
> > To: [email protected]
> > Subject: Check the input files present in cluster
> >
> > Hi,
> >
> > I am new to mahout and trying out clustering. I created a cluster using
> > kmeans in bash. I want to know which files are present in a given clusters.
> > I tried looking for it in cluster dumper but didn't find the required
> > solution. Can anyone help me with this?
> >
> > Thanks.
> >
> > --
> > Everything we hear is an opinion, not a fact.
> > Everything we see is perspective, not the truth.
>
> --
> Everything we hear is an opinion, not a fact.
> Everything we see is perspective, not the truth.
