On Tuesday 03 June 2008 04:53:22 Chris Douglas wrote:
> Is anyone observing this outside of streaming?
>
> We've been able to reproduce this trace with a bad comparator that
> only returns negative values, but haven't found any uncontrived
> patterns in data that produce this, nor any comparators in 0.17 with
> this property. A bad partitioner also returning only negative values
> would behave similarly, but not this uniformly.
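
To make the "bad comparator that only returns negative values" concrete, here is a minimal sketch. It is only an illustration: it uses the plain java.util.Comparator interface rather than Hadoop's own comparator classes, it is not the comparator from any real job, and the class and method names are made up for this example. It checks one pair of keys against the contract rule sgn(compare(a, b)) == -sgn(compare(b, a)).

```java
import java.util.Comparator;

public class BadComparatorDemo {
    // Hypothetical broken comparator: ignores its arguments and always says "less than".
    static final Comparator<String> ALWAYS_NEGATIVE = (a, b) -> -1;

    // A well-behaved comparator, for contrast.
    static final Comparator<String> NATURAL = Comparator.naturalOrder();

    // True if the comparator breaks sgn(compare(a, b)) == -sgn(compare(b, a)) for this pair.
    static boolean violatesAntisymmetry(Comparator<String> c, String a, String b) {
        return Integer.signum(c.compare(a, b)) != -Integer.signum(c.compare(b, a));
    }

    public static void main(String[] args) {
        System.out.println("always-negative violates contract: "
                + violatesAntisymmetry(ALWAYS_NEGATIVE, "a", "b"));
        System.out.println("natural order violates contract:   "
                + violatesAntisymmetry(NATURAL, "a", "b"));
    }
}
```

With such a comparator every key claims to sort before every other key, so a sort that relies on the contract has no consistent order to converge on. Whether that is what happens in this job is not established here.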

Ok, let's take a look. The hadoop call looks like this:

    hadoop jar $HOME/hadoop-0.17.0/contrib/streaming/hadoop-0.17.0-streaming.jar -output /user/hadoop/$(basename $(pwd)) -mapper cat -reducer /home/hadoop/bin/lrp\ --stderr -jobconf mapred.reduce.tasks=88 $CMD

The data is a representation of log lines and is not exactly small; it
has already been reduced once. The bug is probably triggered by size,
because reducing the data in two separate smaller runs works fine. I
have no small data set that triggers this problem.

The interesting thing is that it happens inside the last map task, not
in the reduce tasks. As you can see above, the mapper command is rather
on the simple side.

> How many reducers are you running? Are you using the 0.17 streaming
> jar? Are you running with the default comparator/partitioner? If you
> run the same job as a Java sort, do you see the same behavior? -C

I have no Java implementation of my job, sorry.

Andreas


Hadoop job_200805291303_0088 on ec2-67-202-58-97

User:       hadoop
Job Name:   streamjob51857.jar
Job File:   /mnt/tmp/hadoop-hadoop/mapred/system/job_200805291303_0088/job.xml
Status:     Failed
Started at: Mon Jun 02 16:11:29 GMT 2008
Failed at:  Mon Jun 02 16:13:34 GMT 2008
Failed in:  2mins, 5sec

Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
map         98.61%         72        0        0        71       1  4 / 11
reduce     100.00%         88        0        0         0      88  0 / 22

Counter                             Map       Reduce          Total
File Systems
  Local bytes written     2,790,820,175  107,780,646  2,898,600,821
  HDFS bytes read         2,633,043,249            0  2,633,043,249
Job Counters
  Failed map tasks                    0            0              1
  Launched map tasks                  0            0             86
  Launched reduce tasks               0            0             22
  Data-local map tasks                0            0             69
  Rack-local map tasks                0            0              5
Map-Reduce Framework
  Map input records          12,148,547            0     12,148,547
  Map output records         12,148,547            0     12,148,547
  Map input bytes         2,633,043,249            0  2,633,043,249
  Map output bytes        2,645,311,659            0  2,645,311,659
  Combine input records               0            0              0
  Combine output records              0            0              0
  Reduce input groups                 0            0              0
  Reduce input records                0            0              0
  Reduce output records               0            0              0
