Unfortunately, I went straight from 0.17.2 to 0.18.1. It was working on 0.17.2.

On Sun, Oct 26, 2008 at 9:48 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

> Did this work with 0.18.0 or other prior versions for you?
>
> On Oct 25, 2008, at 7:23 PM, Philippe Lamarche wrote:
>
>> Hi,
>>
>> I just updated to hadoop 0.18.1 and got a clean version of mahout from svn.
>> However, I am having problems with KMeans, that can be traced down to:
>>
>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
>> 2008-10-25 19:10:16,999 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810251826_0013_r_000000_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
>> Caused by: java.lang.NumberFormatException: For input string: "["
>>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
>>     at java.lang.Double.parseDouble(Double.java:510)
>>     at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
>>     at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
>>     ... 1 more
>>
>> 2008-10-25 19:10:16,999 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 0 files left.
>> 2008-10-25 19:10:17,000 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>
>> This is while running the synthetic_control.data example, but I have the
>> same problems with any other input data.
>>
>> I am able to do other map-reduce job without problems.
>>
>> Here is the output of the jar task:
>>
>> [EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/dist/apache-mahout-examples-0.1-dev.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>> 08/10/25 19:09:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>> 08/10/25 19:09:28 INFO mapred.JobClient: Running job: job_200810251826_0010
>> 08/10/25 19:09:29 INFO mapred.JobClient: map 0% reduce 0%
>> 08/10/25 19:09:31 INFO mapred.JobClient: map 50% reduce 0%
>> 08/10/25 19:09:32 INFO mapred.JobClient: Job complete: job_200810251826_0010
>> 08/10/25 19:09:32 INFO mapred.JobClient: Counters: 7
>> 08/10/25 19:09:32 INFO mapred.JobClient: File Systems
>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes read=291644
>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes written=323660
>> 08/10/25 19:09:32 INFO mapred.JobClient: Job Counters
>> 08/10/25 19:09:32 INFO mapred.JobClient: Launched map tasks=2
>> 08/10/25 19:09:32 INFO mapred.JobClient: Data-local map tasks=2
>> 08/10/25 19:09:32 INFO mapred.JobClient: Map-Reduce Framework
>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input records=600
>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input bytes=288374
>> 08/10/25 19:09:32 INFO mapred.JobClient: Map output records=600
>> 08/10/25 19:09:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:09:32 INFO mapred.JobClient: Running job: job_200810251826_0011
>> 08/10/25 19:09:33 INFO mapred.JobClient: map 0% reduce 0%
>> 08/10/25 19:09:37 INFO mapred.JobClient: map 50% reduce 0%
>> 08/10/25 19:09:39 INFO mapred.JobClient: map 100% reduce 0%
>> 08/10/25 19:09:44 INFO mapred.JobClient: map 100% reduce 16%
>> 08/10/25 19:09:52 INFO mapred.JobClient: Job complete: job_200810251826_0011
>> 08/10/25 19:09:52 INFO mapred.JobClient: Counters: 16
>> 08/10/25 19:09:52 INFO mapred.JobClient: File Systems
>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes read=323660
>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes written=1447
>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes read=1389
>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes written=37878
>> 08/10/25 19:09:52 INFO mapred.JobClient: Job Counters
>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched reduce tasks=1
>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched map tasks=2
>> 08/10/25 19:09:52 INFO mapred.JobClient: Data-local map tasks=2
>> 08/10/25 19:09:52 INFO mapred.JobClient: Map-Reduce Framework
>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input groups=1
>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine output records=29
>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input records=600
>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce output records=1
>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output bytes=943020
>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input bytes=323660
>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine input records=1760
>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output records=1732
>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input records=1
>> 08/10/25 19:09:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:09:53 INFO mapred.JobClient: Running job: job_200810251826_0012
>> 08/10/25 19:09:54 INFO mapred.JobClient: map 0% reduce 0%
>> 08/10/25 19:09:56 INFO mapred.JobClient: map 50% reduce 0%
>> 08/10/25 19:09:58 INFO mapred.JobClient: map 100% reduce 0%
>> 08/10/25 19:10:02 INFO mapred.JobClient: Job complete: job_200810251826_0012
>> 08/10/25 19:10:02 INFO mapred.JobClient: Counters: 16
>> 08/10/25 19:10:02 INFO mapred.JobClient: File Systems
>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes read=326554
>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes written=1137260
>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes read=1147358
>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes written=2304490
>> 08/10/25 19:10:02 INFO mapred.JobClient: Job Counters
>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched reduce tasks=1
>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched map tasks=2
>> 08/10/25 19:10:02 INFO mapred.JobClient: Data-local map tasks=2
>> 08/10/25 19:10:02 INFO mapred.JobClient: Map-Reduce Framework
>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input groups=1
>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine output records=0
>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input records=600
>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce output records=600
>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output bytes=1139660
>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input bytes=323660
>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine input records=0
>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output records=600
>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input records=600
>> 08/10/25 19:10:02 INFO kmeans.KMeansDriver: Iteration 0
>> 08/10/25 19:10:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/25 19:10:03 INFO mapred.JobClient: Running job: job_200810251826_0013
>> 08/10/25 19:10:04 INFO mapred.JobClient: map 0% reduce 0%
>> 08/10/25 19:10:08 INFO mapred.JobClient: map 50% reduce 0%
>> 08/10/25 19:10:09 INFO mapred.JobClient: map 100% reduce 0%
>> 08/10/25 19:10:21 INFO mapred.JobClient: Task Id : attempt_200810251826_0013_r_000000_0, Status : FAILED
>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>
>> I am not sure if I am doing something wrong here.
>>
>> Thanks for the help,
>>
>> Philippe.
>
> --------------------------
> Grant Ingersoll
> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
> http://www.lucenebootcamp.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
