Hey,

I was testing the k-means driver on the Reuters data.

Commands used:

1. bin/mahout seqdirectory -c UTF-8 -i reuters/reuters21578 -o reuters/reuters-seqfiles
2. bin/mahout seq2sparse -i reuters/reuters-seqfiles/ -o reuters/reuters-vectors-bigram -ow -a org.apache.lucene.analysis.WhitespaceAnalyzer -chunk 200 -wt tf -s 5 -md 3 -x 90 -ng 1
3. bin/mahout kmeans -i reuters/reuters-vectors-bigram/ -c reuters/reuters-initial-clusters -o reuters/reuters-kmeans-clusters -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure -k 20 --maxIter 100

The third command (kmeans) fails with the following exception. Am I doing something wrong?

Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.mahout.math.VectorWritable
    at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:90)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:102)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
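
The exception suggests that RandomSeedGenerator read a sequence file whose values are IntWritable rather than VectorWritable, so it may help to check which key and value classes each file under the kmeans input path declares. Below is a minimal sketch of such a check, not part of Mahout. It assumes the usual SequenceFile header layout (the bytes "SEQ", a version byte, then the key and value class names as length-prefixed strings), a format version of 4 or later, and class names shorter than 128 bytes. The function name read_seqfile_classes is my own, and the demo header is built in memory purely for illustration.

```python
import io
import struct


def read_seqfile_classes(stream):
    """Return (key_class, value_class) as declared in a SequenceFile header.

    Sketch only: assumes format version >= 4 and class names < 128 bytes.
    """
    if stream.read(3) != b"SEQ":
        raise ValueError("not a Hadoop SequenceFile")
    version = stream.read(1)[0]
    if version < 4:
        # Assumption: older versions encode the class names differently.
        raise ValueError("header version %d not handled in this sketch" % version)

    def read_name():
        # For lengths below 128 the variable-length int is a single byte.
        (length,) = struct.unpack("b", stream.read(1))
        if length < 0:
            raise ValueError("multi-byte length not handled in this sketch")
        return stream.read(length).decode("utf-8")

    key_class = read_name()
    value_class = read_name()
    return key_class, value_class


if __name__ == "__main__":
    # Illustrative in-memory header, not taken from a real file.
    key = b"org.apache.hadoop.io.Text"
    value = b"org.apache.hadoop.io.IntWritable"
    header = b"SEQ" + bytes([6, len(key)]) + key + bytes([len(value)]) + value
    print(read_seqfile_classes(io.BytesIO(header)))
```

To use it on real data, open a local copy of a sequence file in binary mode and pass the file object to read_seqfile_classes. Any file that reports a value class other than org.apache.mahout.math.VectorWritable would be a candidate for what buildRandom failed on.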


Thanks,
Sharath
