I will!
On 10/29/08, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Philippe, can you try the patch suggested by Arun Murthy on
> [EMAIL PROTECTED] See
> http://issues.apache.org/jira/browse/HADOOP-4277
>
> I'm pretty swamped at the moment w/ ApacheCon coming up next week, but if
> it does fix the issue, then maybe we should move forward to the 18.2
> candidate (I don't think it has been released yet, those guys have a pretty
> sophisticated build process going)
>
> -Grant
>
> On Oct 28, 2008, at 7:19 AM, Philippe Lamarche wrote:
>
>> Ubuntu linux 2.6.24 <http://2.6.24.21>, with java-6-sun-1.6.0.07.
>>
>> On Tue, Oct 28, 2008 at 7:03 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>>
>>> Just a single machine. I didn't think we were using features either. Are
>>> you saying you can run the example using 0.18.1?
>>>
>>> BTW, Philippe, what JVM, O/S, etc. are you using?
>>>
>>> -Grant
>>>
>>> On Oct 27, 2008, at 11:55 PM, Jeff Eastman wrote:
>>>
>>>> Hi,
>>>>
>>>> Are you guys running on real Hadoop arrays? I can run the synthetic
>>>> control example just fine on a single machine. That code is just trying
>>>> to read a vector from a string. I'd be surprised if we were using any
>>>> "features" but will watch the threads.
>>>>
>>>> Jeff
>>>>
>>>> Grant Ingersoll wrote:
>>>>
>>>>> I started a thread on [EMAIL PROTECTED]:
>>>>> http://hadoop.markmail.org/message/cczunzfhpcqz6pis
>>>>>
>>>>> On Oct 27, 2008, at 9:49 PM, Grant Ingersoll wrote:
>>>>>
>>>>>> OK, I can confirm that the exact same code works with 0.17.2 and not w/
>>>>>> 0.18.1. So, it sounds like a bug in Hadoop, or we are relying on
>>>>>> incorrect behavior in Hadoop.
>>>>>>
>>>>>> On Oct 27, 2008, at 9:33 PM, Grant Ingersoll wrote:
>>>>>>
>>>>>>> On Oct 26, 2008, at 10:46 AM, Philippe Lamarche wrote:
>>>>>>>
>>>>>>>> Unfortunately, I went straight from 0.17.2 to 0.18.1. It was working
>>>>>>>> on 0.17.2.
>>>>>>>
>>>>>>> BTW, are you saying the same exact code was working on 0.17.2 or are
>>>>>>> you referring to some older Mahout code that worked on 17.2?
>>>>>>>
>>>>>>>> On Sun, Oct 26, 2008 at 9:48 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> Did this work with 0.18.0 or other prior versions for you?
>>>>>>>>>
>>>>>>>>> On Oct 25, 2008, at 7:23 PM, Philippe Lamarche wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I just updated to hadoop 0.18.1 and got a clean version of mahout
>>>>>>>>>> from svn. However, I am having problems with KMeans, that can be
>>>>>>>>>> traced down to :
>>>>>>>>>>
>>>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
>>>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
>>>>>>>>>> 2008-10-25 19:10:16,999 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810251826_0013_r_000000_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
>>>>>>>>>> Caused by: java.lang.NumberFormatException: For input string: "["
>>>>>>>>>>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
>>>>>>>>>>     at java.lang.Double.parseDouble(Double.java:510)
>>>>>>>>>>     at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
>>>>>>>>>>     at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
>>>>>>>>>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
>>>>>>>>>>     at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
>>>>>>>>>>     ... 1 more
>>>>>>>>>>
>>>>>>>>>> 2008-10-25 19:10:16,999 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 0 files left.
>>>>>>>>>> 2008-10-25 19:10:17,000 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>>>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>>>
>>>>>>>>>> This is while running the synthetic_control.data example, but I have
>>>>>>>>>> the same problems with any other input data.
>>>>>>>>>>
>>>>>>>>>> I am able to do other map-reduce job without problems.
>>>>>>>>>>
>>>>>>>>>> Here is the output of the jar task:
>>>>>>>>>>
>>>>>>>>>> [EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/dist/apache-mahout-examples-0.1-dev.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>>>>>>>> 08/10/25 19:09:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.JobClient: Running job: job_200810251826_0010
>>>>>>>>>> 08/10/25 19:09:29 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>>>> 08/10/25 19:09:31 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Job complete: job_200810251826_0010
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Counters: 7
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: File Systems
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes read=291644
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: HDFS bytes written=323660
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Job Counters
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input records=600
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map input bytes=288374
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Map output records=600
>>>>>>>>>> 08/10/25 19:09:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Running job: job_200810251826_0011
>>>>>>>>>> 08/10/25 19:09:33 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>>>> 08/10/25 19:09:37 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>>>> 08/10/25 19:09:39 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>>>> 08/10/25 19:09:44 INFO mapred.JobClient: map 100% reduce 16%
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Job complete: job_200810251826_0011
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Counters: 16
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: File Systems
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes read=323660
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: HDFS bytes written=1447
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes read=1389
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Local bytes written=37878
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Job Counters
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched reduce tasks=1
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input groups=1
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine output records=29
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input records=600
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce output records=1
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output bytes=943020
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map input bytes=323660
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Combine input records=1760
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Map output records=1732
>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Reduce input records=1
>>>>>>>>>> 08/10/25 19:09:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.JobClient: Running job: job_200810251826_0012
>>>>>>>>>> 08/10/25 19:09:54 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>>>> 08/10/25 19:09:56 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>>>> 08/10/25 19:09:58 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Job complete: job_200810251826_0012
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Counters: 16
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: File Systems
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes read=326554
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: HDFS bytes written=1137260
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes read=1147358
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Local bytes written=2304490
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Job Counters
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched reduce tasks=1
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Launched map tasks=2
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Data-local map tasks=2
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map-Reduce Framework
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input groups=1
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine output records=0
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input records=600
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce output records=600
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output bytes=1139660
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map input bytes=323660
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Combine input records=0
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Map output records=600
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Reduce input records=600
>>>>>>>>>> 08/10/25 19:10:02 INFO kmeans.KMeansDriver: Iteration 0
>>>>>>>>>> 08/10/25 19:10:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>> 08/10/25 19:10:03 INFO mapred.JobClient: Running job: job_200810251826_0013
>>>>>>>>>> 08/10/25 19:10:04 INFO mapred.JobClient: map 0% reduce 0%
>>>>>>>>>> 08/10/25 19:10:08 INFO mapred.JobClient: map 50% reduce 0%
>>>>>>>>>> 08/10/25 19:10:09 INFO mapred.JobClient: map 100% reduce 0%
>>>>>>>>>> 08/10/25 19:10:21 INFO mapred.JobClient: Task Id : attempt_200810251826_0013_r_000000_0, Status : FAILED
>>>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>>>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>>>
>>>>>>>>>> I am not sure if I am doing something wrong here.
>>>>>>>>>>
>>>>>>>>>> Thanks for the help,
>>>>>>>>>>
>>>>>>>>>> Philippe.
>>>>>>>>>
>>>>>>>>> --------------------------
>>>>>>>>> Grant Ingersoll
>>>>>>>>> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
>>>>>>>>> http://www.lucenebootcamp.com
>>>>>>>>>
>>>>>>>>> Lucene Helpful Hints:
>>>>>>>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>>>>>>>> http://wiki.apache.org/lucene-java/LuceneFAQ
