Hi Sean, I tried passing the file too. But doing so gives me the following error:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/praneet/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/praneet/.m2/repository/org/slf4j/slf4j-jcl/1.6.1/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/01/07 12:31:57 INFO dirichlet.DirichletDriver: Iteration 1
12/01/07 12:31:57 INFO dirichlet.DirichletDriver: Iteration 2
12/01/07 12:31:57 INFO dirichlet.DirichletDriver: Iteration 3
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 4
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 5
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 6
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 7
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 8
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 9
12/01/07 12:31:58 INFO dirichlet.DirichletDriver: Iteration 10
java.lang.IllegalStateException: file:/home/praneet/Eclipse-Output/output/clusters-10-final/clusters-10
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:82)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:1)
    at com.google.common.collect.Iterators$8.next(Iterators.java:667)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:475)
    at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:39)
    at org.apache.mahout.clustering.dirichlet.DirichletClusterMapper.loadClusters(DirichletClusterMapper.java:68)
    at org.apache.mahout.clustering.dirichlet.DirichletDriver.clusterDataSeq(DirichletDriver.java:487)
    at org.apache.mahout.clustering.dirichlet.DirichletDriver.clusterData(DirichletDriver.java:474)
    at org.apache.mahout.clustering.dirichlet.DirichletDriver.run(DirichletDriver.java:172)
    at org.apache.mahout.clustering.TestClusterDumper.testDirichlet2(TestClusterDumper.java:297)
    at org.apache.mahout.clustering.Test.main(Test.java:40)
Caused by: java.io.FileNotFoundException: /home/praneet/Eclipse-Output/output/clusters-10-final/clusters-10 (Is a directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:137)
    at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:70)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:106)
    at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:126)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1437)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileValueIterator.<init>(SequenceFileValueIterator.java:51)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:78)
    ... 10 more

This is what I get when I try

Path path = new Path("/home/praneet/Eclipse-Output/output/clusteredPoints/part-m-0");

instead of

Path path = new Path("/home/praneet/Eclipse-Output/output/clusteredPoints");

Since the directory has only one file, part-m-0, I do not need to read the whole directory. But I'll still try the approach you suggested and see how things work out.

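For reference, the "(Is a directory)" failure comes from opening a directory as if it were a single file. Below is a minimal sketch of the check that avoids it: expand a directory into its part files before opening anything. It deliberately uses plain java.nio on the local filesystem rather than the Hadoop FileSystem API, so it is not what SequenceFileDirIterable does internally. The class name PartFileResolver, the method resolve, and the "part-" name prefix (taken from the part-m-0 file above) are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Mahout or Hadoop.
// Note: Path here is java.nio.file.Path, not org.apache.hadoop.fs.Path.
public class PartFileResolver {

    /**
     * Returns the data files to read for the given location.
     * A regular file is returned as-is; a directory is expanded to the
     * regular files directly inside it whose names start with "part-".
     * Anything else (for example a missing path) yields an empty list.
     */
    public static List<Path> resolve(Path location) throws IOException {
        List<Path> files = new ArrayList<>();
        if (Files.isDirectory(location)) {
            // Files.list is not recursive; close the stream when done.
            try (Stream<Path> children = Files.list(location)) {
                children
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().startsWith("part-"))
                    .forEach(files::add);
            }
            Collections.sort(files); // stable, name-ordered result
        } else if (Files.isRegularFile(location)) {
            files.add(location);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path location = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : resolve(location)) {
            System.out.println(file);
        }
    }
}
```

Each path returned by resolve is a plain file, so it can be handed to a reader that expects exactly one file, whether the starting point was the clusteredPoints directory or the part file itself.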
On Fri, Jan 6, 2012 at 9:09 PM, Sean Owen <[email protected]> wrote:

> The error is right there:
>
> Exception in thread "main" java.io.FileNotFoundException:
> /home/praneet/Eclipse-Output/output/clusteredPoints (Is a directory)
>
> You are passing a directory, not a file.
> Look at the class SequenceFileDirIterable for an easy way to iterate
> over all files in a directory as key-value pairs.
>
> On Sat, Jan 7, 2012 at 3:01 AM, praneet mhatre <[email protected]> wrote:
> > Hi Abin and Petar,
> >
> > I tried the above approach with Dirichlet clustering. I am using the
> > following code snippet after clustering is completed.
> >
> > Configuration conf = new Configuration();
> > FileSystem fs = FileSystem.get(conf);
> > Path path = new Path("/home/praneet/Eclipse-Output/output/clusteredPoints");
> >
> > SequenceFile.Reader reader = new SequenceFile.Reader(fs,path,conf);
> > IntWritable key = new IntWritable();
> > WeightedVectorWritable value = new WeightedVectorWritable();
> > while(reader.next(key,value))
> > {
> >     System.out.print(value.toString() +" is in cluster " + key.toString() );
> > }
> > System.out.println();
> >
> > But I am getting the following error:
> >
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in [jar:file:/home/praneet/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in [jar:file:/home/praneet/.m2/repository/org/slf4j/slf4j-jcl/1.6.1/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> > 12/01/06 18:47:45 INFO dirichlet.DirichletDriver: Iteration 1
> > 12/01/06 18:47:45 INFO dirichlet.DirichletDriver: Iteration 2
> > 12/01/06 18:47:45 INFO dirichlet.DirichletDriver: Iteration 3
> > 12/01/06 18:47:45 INFO dirichlet.DirichletDriver: Iteration 4
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 5
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 6
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 7
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 8
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 9
> > 12/01/06 18:47:46 INFO dirichlet.DirichletDriver: Iteration 10
> > 12/01/06 18:47:47 INFO clustering.ClusterDumper: Wrote 10 clusters
> > Exception in thread "main" java.io.FileNotFoundException:
> > /home/praneet/Eclipse-Output/output/clusteredPoints (Is a directory)
> >     at java.io.FileInputStream.open(Native Method)
> >     at java.io.FileInputStream.<init>(FileInputStream.java:137)
> >     at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:70)
> >     at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:106)
> >     at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
> >     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:126)
> >     at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
> >     at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1437)
> >     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
> >     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
> >     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
> >     at org.apache.mahout.clustering.Test.main(Test.java:46)
> >
> > Any suggestions?
> >
> > On Wed, Dec 28, 2011 at 12:25 AM, petar.mitrovic <[email protected]> wrote:
> >
> >> Hi Abin,
> >>
> >> Thank you very much! Your suggestion helped me a lot.
> >>
> >> First, I've set named vector parameter (-nv) to Mahout's vector generation
> >> process (seq2sparse) in order to write more descriptive vectors.
> >>
> >> Later, I could use something like this:
> >>
> >> IntWritable key= new IntWritable();
> >> WeightedVectorWritable vector = new WeightedVectorWritable();
> >> while (reader.next(key, vector)) {
> >>     NamedVector nv = (NamedVector) vector.getVector();
> >>     System.out.println(nv.getName() + " belongs to cluster " +
> >>         key.toString());
> >> }
> >>
> >> Hope this can be useful for someone else, too.
> >>
> >> Regards,
> >> Petar
> >>
> >> --
> >> View this message in context:
> >> http://lucene.472066.n3.nabble.com/How-to-determine-which-cluster-an-item-belongs-to-tp3613013p3615979.html
> >> Sent from the Mahout User List mailing list archive at Nabble.com.
> >>
> >
> > --
> > Praneet Mhatre
> > Graduate Student
> > Donald Bren School of ICS
> > University of California, Irvine
>

--
Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine
