Hi,

Are you guys running on real Hadoop arrays? I can run the synthetic control example just fine on a single machine. That code is just trying to read a vector from a string. I'd be surprised if we were using any "features" but will watch the threads.
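
For what it's worth, the innermost exception in the quoted trace can be reproduced without Hadoop or Mahout. This is only a minimal sketch: the class name ParseFailureDemo and the helper parseError are made up for illustration, and it makes no claim about what DenseVector.decodeFormat actually does beyond the fact, visible in the trace, that Double.parseDouble ended up receiving the string "[".

```java
// Hypothetical standalone demo; not Mahout code.
class ParseFailureDemo {

    // Returns the NumberFormatException message for a token,
    // or null if the token parses as a double.
    static String parseError(String token) {
        try {
            Double.parseDouble(token);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A bare bracket is not a number, so this prints the exception message.
        System.out.println(parseError("["));
        // A plain numeric token parses, so this prints null.
        System.out.println(parseError("1.5"));
    }
}
```

If the string handed to the combiner is not in the form the decoder expects, a delimiter such as "[" can reach the number parser as its own token, which would be consistent with the trace but is not confirmed by it.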

Jeff



Grant Ingersoll wrote:
I started a thread on [EMAIL PROTECTED]: http://hadoop.markmail.org/message/cczunzfhpcqz6pis


On Oct 27, 2008, at 9:49 PM, Grant Ingersoll wrote:

OK, I can confirm that the exact same code works with 0.17.2 and not w/ 0.18.1. So, it sounds like a bug in Hadoop, or we are relying on incorrect behavior in Hadoop.


On Oct 27, 2008, at 9:33 PM, Grant Ingersoll wrote:


On Oct 26, 2008, at 10:46 AM, Philippe Lamarche wrote:

Unfortunately, I went straight from 0.17.2 to 0.18.1. It was working on
0.17.2.


BTW, are you saying the exact same code was working on 0.17.2, or are you referring to some older Mahout code that worked on 0.17.2?




On Sun, Oct 26, 2008 at 9:48 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

Did this work with 0.18.0 or other prior versions for you?



On Oct 25, 2008, at 7:23 PM, Philippe Lamarche wrote:

Hi,

I just updated to Hadoop 0.18.1 and got a clean version of Mahout from svn.
However, I am having problems with KMeans, which can be traced to:

2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
2008-10-25 19:10:16,999 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810251826_0013_r_000000_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
Caused by: java.lang.NumberFormatException: For input string: "["
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
    at java.lang.Double.parseDouble(Double.java:510)
    at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
    at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
    at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
    at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
    ... 1 more

2008-10-25 19:10:16,999 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 0 files left.
2008-10-25 19:10:17,000 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)


This is while running the synthetic_control.data example, but I have the
same problems with any other input data.

I am able to run other map-reduce jobs without problems.

Here is the output of the jar task:

[EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/dist/apache-mahout-examples-0.1-dev.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
08/10/25 19:09:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input paths to process : 1
08/10/25 19:09:28 INFO mapred.JobClient: Running job: job_200810251826_0010
08/10/25 19:09:29 INFO mapred.JobClient:  map 0% reduce 0%
08/10/25 19:09:31 INFO mapred.JobClient:  map 50% reduce 0%
08/10/25 19:09:32 INFO mapred.JobClient: Job complete: job_200810251826_0010
08/10/25 19:09:32 INFO mapred.JobClient: Counters: 7
08/10/25 19:09:32 INFO mapred.JobClient:   File Systems
08/10/25 19:09:32 INFO mapred.JobClient:     HDFS bytes read=291644
08/10/25 19:09:32 INFO mapred.JobClient:     HDFS bytes written=323660
08/10/25 19:09:32 INFO mapred.JobClient:   Job Counters
08/10/25 19:09:32 INFO mapred.JobClient:     Launched map tasks=2
08/10/25 19:09:32 INFO mapred.JobClient:     Data-local map tasks=2
08/10/25 19:09:32 INFO mapred.JobClient:   Map-Reduce Framework
08/10/25 19:09:32 INFO mapred.JobClient:     Map input records=600
08/10/25 19:09:32 INFO mapred.JobClient:     Map input bytes=288374
08/10/25 19:09:32 INFO mapred.JobClient:     Map output records=600
08/10/25 19:09:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:09:32 INFO mapred.JobClient: Running job: job_200810251826_0011
08/10/25 19:09:33 INFO mapred.JobClient:  map 0% reduce 0%
08/10/25 19:09:37 INFO mapred.JobClient:  map 50% reduce 0%
08/10/25 19:09:39 INFO mapred.JobClient:  map 100% reduce 0%
08/10/25 19:09:44 INFO mapred.JobClient:  map 100% reduce 16%
08/10/25 19:09:52 INFO mapred.JobClient: Job complete: job_200810251826_0011
08/10/25 19:09:52 INFO mapred.JobClient: Counters: 16
08/10/25 19:09:52 INFO mapred.JobClient:   File Systems
08/10/25 19:09:52 INFO mapred.JobClient:     HDFS bytes read=323660
08/10/25 19:09:52 INFO mapred.JobClient:     HDFS bytes written=1447
08/10/25 19:09:52 INFO mapred.JobClient:     Local bytes read=1389
08/10/25 19:09:52 INFO mapred.JobClient:     Local bytes written=37878
08/10/25 19:09:52 INFO mapred.JobClient:   Job Counters
08/10/25 19:09:52 INFO mapred.JobClient:     Launched reduce tasks=1
08/10/25 19:09:52 INFO mapred.JobClient:     Launched map tasks=2
08/10/25 19:09:52 INFO mapred.JobClient:     Data-local map tasks=2
08/10/25 19:09:52 INFO mapred.JobClient:   Map-Reduce Framework
08/10/25 19:09:52 INFO mapred.JobClient:     Reduce input groups=1
08/10/25 19:09:52 INFO mapred.JobClient:     Combine output records=29
08/10/25 19:09:52 INFO mapred.JobClient:     Map input records=600
08/10/25 19:09:52 INFO mapred.JobClient:     Reduce output records=1
08/10/25 19:09:52 INFO mapred.JobClient:     Map output bytes=943020
08/10/25 19:09:52 INFO mapred.JobClient:     Map input bytes=323660
08/10/25 19:09:52 INFO mapred.JobClient:     Combine input records=1760
08/10/25 19:09:52 INFO mapred.JobClient:     Map output records=1732
08/10/25 19:09:52 INFO mapred.JobClient:     Reduce input records=1
08/10/25 19:09:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:09:53 INFO mapred.JobClient: Running job: job_200810251826_0012
08/10/25 19:09:54 INFO mapred.JobClient:  map 0% reduce 0%
08/10/25 19:09:56 INFO mapred.JobClient:  map 50% reduce 0%
08/10/25 19:09:58 INFO mapred.JobClient:  map 100% reduce 0%
08/10/25 19:10:02 INFO mapred.JobClient: Job complete: job_200810251826_0012
08/10/25 19:10:02 INFO mapred.JobClient: Counters: 16
08/10/25 19:10:02 INFO mapred.JobClient:   File Systems
08/10/25 19:10:02 INFO mapred.JobClient:     HDFS bytes read=326554
08/10/25 19:10:02 INFO mapred.JobClient:     HDFS bytes written=1137260
08/10/25 19:10:02 INFO mapred.JobClient:     Local bytes read=1147358
08/10/25 19:10:02 INFO mapred.JobClient:     Local bytes written=2304490
08/10/25 19:10:02 INFO mapred.JobClient:   Job Counters
08/10/25 19:10:02 INFO mapred.JobClient:     Launched reduce tasks=1
08/10/25 19:10:02 INFO mapred.JobClient:     Launched map tasks=2
08/10/25 19:10:02 INFO mapred.JobClient:     Data-local map tasks=2
08/10/25 19:10:02 INFO mapred.JobClient:   Map-Reduce Framework
08/10/25 19:10:02 INFO mapred.JobClient:     Reduce input groups=1
08/10/25 19:10:02 INFO mapred.JobClient:     Combine output records=0
08/10/25 19:10:02 INFO mapred.JobClient:     Map input records=600
08/10/25 19:10:02 INFO mapred.JobClient:     Reduce output records=600
08/10/25 19:10:02 INFO mapred.JobClient:     Map output bytes=1139660
08/10/25 19:10:02 INFO mapred.JobClient:     Map input bytes=323660
08/10/25 19:10:02 INFO mapred.JobClient:     Combine input records=0
08/10/25 19:10:02 INFO mapred.JobClient:     Map output records=600
08/10/25 19:10:02 INFO mapred.JobClient:     Reduce input records=600
08/10/25 19:10:02 INFO kmeans.KMeansDriver: Iteration 0
08/10/25 19:10:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/25 19:10:03 INFO mapred.JobClient: Running job: job_200810251826_0013
08/10/25 19:10:04 INFO mapred.JobClient:  map 0% reduce 0%
08/10/25 19:10:08 INFO mapred.JobClient:  map 50% reduce 0%
08/10/25 19:10:09 INFO mapred.JobClient:  map 100% reduce 0%
08/10/25 19:10:21 INFO mapred.JobClient: Task Id : attempt_200810251826_0013_r_000000_0, Status : FAILED
java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)


I am not sure if I am doing something wrong here.

Thanks for the help,

Philippe.


--------------------------
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










