Hi All,

I get the following error when running the k-means algorithm on the
Reuters dataset. The topic has been discussed in many forums (for example,
http://search.lucidimagination.com/search/document/3f6b06ee9d45b4fe/tranforming_data_for_k_means_analysis),
but no one has posted a solution. Has anyone faced this and found a fix?

dipti@dipti-laptop:~$ mahout kmeans -i vect-output/tf-vectors/part-r-00000 -k 15 --output kmeans-output --clusters kmeans-output/clusters --maxIter 200
Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
11/05/04 18:41:59 INFO common.AbstractJob: Command line arguments: {--clusters=kmeans-output/clusters, --convergenceDelta=0.5, --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure, --endPhase=2147483647, --input=vect-output/tf-vectors/part-r-00000, --maxIter=200, --method=mapreduce, --numClusters=15, --output=kmeans-output, --startPhase=0, --tempDir=temp}
11/05/04 18:41:59 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/05/04 18:41:59 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
11/05/04 18:41:59 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:571)
    at java.util.ArrayList.get(ArrayList.java:349)
    at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:107)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:96)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:54)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Regards,
Dipti Mathur
