Ubuntu Linux 2.6.24, with java-6-sun-1.6.0.07.
On Tue, Oct 28, 2008 at 7:03 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

> Just a single machine. I didn't think we were using features either. Are
> you saying you can run the example using 0.18.1?
>
> BTW, Philippe, what JVM, O/S, etc. are you using?
>
> -Grant
>
> On Oct 27, 2008, at 11:55 PM, Jeff Eastman wrote:
>
>> Hi,
>>
>> Are you guys running on real Hadoop arrays? I can run the synthetic
>> control example just fine on a single machine. That code is just trying to
>> read a vector from a string. I'd be surprised if we were using any
>> "features" but will watch the threads.
>>
>> Jeff
>>
>> Grant Ingersoll wrote:
>>
>>> I started a thread on [EMAIL PROTECTED]:
>>> http://hadoop.markmail.org/message/cczunzfhpcqz6pis
>>>
>>> On Oct 27, 2008, at 9:49 PM, Grant Ingersoll wrote:
>>>
>>>> OK, I can confirm that the exact same code works with 0.17.2 and not w/
>>>> 0.18.1. So, it sounds like a bug in Hadoop, or we are relying on
>>>> incorrect behavior in Hadoop.
>>>>
>>>> On Oct 27, 2008, at 9:33 PM, Grant Ingersoll wrote:
>>>>
>>>>> On Oct 26, 2008, at 10:46 AM, Philippe Lamarche wrote:
>>>>>
>>>>>> Unfortunately, I went straight from 0.17.2 to 0.18.1. It was working
>>>>>> on 0.17.2.
>>>>>
>>>>> BTW, are you saying the same exact code was working on 0.17.2 or are
>>>>> you referring to some older Mahout code that worked on 17.2?
>>>>>
>>>>>> On Sun, Oct 26, 2008 at 9:48 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Did this work with 0.18.0 or other prior versions for you?
>>>>>>>
>>>>>>> On Oct 25, 2008, at 7:23 PM, Philippe Lamarche wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I just updated to hadoop 0.18.1 and got a clean version of mahout
>>>>>>>> from svn.
>>>>>>>> However, I am having problems with KMeans, which can be traced to:
>>>>>>>>
>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
>>>>>>>> 2008-10-25 19:10:16,999 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810251826_0013_r_000000_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
>>>>>>>> Caused by: java.lang.NumberFormatException: For input string: "["
>>>>>>>>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
>>>>>>>>     at java.lang.Double.parseDouble(Double.java:510)
>>>>>>>>     at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
>>>>>>>>     at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
>>>>>>>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
>>>>>>>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
>>>>>>>>     ... 1 more
>>>>>>>>
>>>>>>>> 2008-10-25 19:10:16,999 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 0 files left.
>>>>>>>> 2008-10-25 19:10:17,000 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>
>>>>>>>> This is while running the synthetic_control.data example, but I have
>>>>>>>> the same problems with any other input data.
>>>>>>>>
>>>>>>>> I am able to run other map-reduce jobs without problems.
>>>>>>>>
>>>>>>>> Here is the output of the jar task:
>>>>>>>>
>>>>>>>> [EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/dist/apache-mahout-examples-0.1-dev.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>>>>>> 08/10/25 19:09:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>>>>>>>> 08/10/25 19:09:28 INFO mapred.JobClient: Running job: job_200810251826_0010
>>>>>>>> 08/10/25 19:09:29 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>> 08/10/25 19:09:31 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Job complete: job_200810251826_0010
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Counters: 7
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: File Systems
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes read=291644
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes written=323660
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Job Counters
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input records=600
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input bytes=288374
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map output records=600
>>>>>>>> 08/10/25 19:09:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Running job: job_200810251826_0011
>>>>>>>> 08/10/25 19:09:33 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>> 08/10/25 19:09:37 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>> 08/10/25 19:09:39 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>> 08/10/25 19:09:44 INFO mapred.JobClient: map 100% reduce 16%
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Job complete: job_200810251826_0011
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Counters: 16
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: File Systems
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes read=323660
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes written=1447
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes read=1389
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes written=37878
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Job Counters
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched reduce tasks=1
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input groups=1
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine output records=29
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input records=600
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce output records=1
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output bytes=943020
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input bytes=323660
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine input records=1760
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output records=1732
>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input records=1
>>>>>>>> 08/10/25 19:09:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:09:53 INFO mapred.JobClient: Running job: job_200810251826_0012
>>>>>>>> 08/10/25 19:09:54 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>> 08/10/25 19:09:56 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>> 08/10/25 19:09:58 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Job complete: job_200810251826_0012
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Counters: 16
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: File Systems
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes read=326554
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes written=1137260
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes read=1147358
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes written=2304490
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Job Counters
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched reduce tasks=1
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input groups=1
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine output records=0
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input records=600
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce output records=600
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output bytes=1139660
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input bytes=323660
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine input records=0
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output records=600
>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input records=600
>>>>>>>> 08/10/25 19:10:02 INFO kmeans.KMeansDriver: Iteration 0
>>>>>>>> 08/10/25 19:10:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>> 08/10/25 19:10:03 INFO mapred.JobClient: Running job: job_200810251826_0013
>>>>>>>> 08/10/25 19:10:04 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>> 08/10/25 19:10:08 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>> 08/10/25 19:10:09 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>> 08/10/25 19:10:21 INFO mapred.JobClient: Task Id : attempt_200810251826_0013_r_000000_0, Status : FAILED
>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>
>>>>>>>> I am not sure if I am doing something wrong here.
>>>>>>>>
>>>>>>>> Thanks for the help,
>>>>>>>>
>>>>>>>> Philippe.
>>>>>>>>
>>>>>>> --------------------------
>>>>>>> Grant Ingersoll
>>>>>>> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
>>>>>>> http://www.lucenebootcamp.com
>>>>>>>
>>>>>>> Lucene Helpful Hints:
>>>>>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>>>>>> http://wiki.apache.org/lucene-java/LuceneFAQ
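
For anyone following the thread, the innermost frames of the trace quoted above (Double.parseDouble called from DenseVector.decodeFormat, itself reached from KMeansCombiner.reduce via ReduceTask$ReduceCopier.combineAndSpill) can be imitated outside Hadoop. This is only a sketch under stated assumptions: decodePlain and both sample strings are made up for illustration and are not Mahout's actual decoder or vector encoding. It shows how a decoder that expects comma-separated doubles fails with the same message when it is handed a bracketed string instead. Whether a mismatch of this kind is what happens under 0.18.1 is not established in this thread.

```java
public class VectorParseSketch {

    // Hypothetical decoder (NOT Mahout's): expects plain comma-separated
    // doubles such as "1.0, 2.5" and parses every token as a double.
    static double[] decodePlain(String s) {
        String[] parts = s.split(",");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Input in the encoding the decoder expects: parses cleanly.
        System.out.println(decodePlain("1.0, 2.5").length);

        // Input in a different (assumed, bracketed) encoding: the first
        // token is "[", which Double.parseDouble rejects.
        try {
            decodePlain("[, 1.0, 2.5, ]");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If a mismatch like this were the cause, comparing the string the combiner receives with the string it emits would be a reasonable first check.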
