Re: How to debug reducer thread?

2010-04-27 Thread Eric Sammer
If you want to step through a full map / reduce job, the easiest way to do this is to run a job using the local job runner in your IDE. The local job runner will run the MR job in a single thread making it very easy to debug. You will want to use the local file system and a small amount of data dur
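A minimal sketch of the settings that select the local job runner and the local file system. It assumes the 0.20-era property names mapred.job.tracker and fs.default.name. The class LocalRunnerSettings and its methods are hypothetical helpers for illustration, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): collects the settings that
// make a job run in-process on local files so an IDE debugger can step
// through the map and reduce code.
class LocalRunnerSettings {

    // Assumed property names from Hadoop 0.20-era configuration.
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("mapred.job.tracker", "local"); // use the local job runner
        conf.put("fs.default.name", "file:///"); // use the local file system
        return conf;
    }

    // Renders the settings as "-D key=value" generic options, the form
    // accepted by drivers that parse their arguments through ToolRunner.
    static List<String> asGenericOptions(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("-D");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] argv) {
        System.out.println(String.join(" ", asGenericOptions(settings())));
    }
}
```

The printed options could be pasted into the program arguments of an IDE run configuration, provided the job driver goes through ToolRunner. Otherwise the same two keys can be set directly on the job's configuration object before submitting.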

Re: Separate communications of HDFS and MapReduce

2010-04-27 Thread Druilhe Remi
Thanks for your answer :)
Allen Wittenauer wrote:
> On Apr 26, 2010, at 6:23 AM, Druilhe Remi wrote:
>
>> For example, when I run "wordcount" example, there is HDFS communications
>> and MapReduce communications and I am not able to distinguish which packet
>> belong to HDFS or to MapReduc

Re: Number of map tasks spawned

2010-04-27 Thread Hemanth Yamijala
On Sun, Apr 25, 2010 at 4:28 PM, Praveen Sripati wrote:
>
> Hi,
>
> The MapReduce tutorial specifies that
>
>>> The Hadoop Map/Reduce framework spawns one map task for each InputSplit
>>> generated by the InputFormat for the job.
>
> But, the mapred.map.tasks definition is
>
>>> The default number

Re: Question about jobtracker metrics

2010-04-27 Thread Hemanth Yamijala
On Tue, Apr 27, 2010 at 12:43 AM, Harold Lim wrote:
> Hi All,
>
> I was looking at the jobtracker metrics and it seems to be able to give me:
> jobs_completed, jobs_submitted, maps_completed, maps_launched,
> reduces_completed, reduces_launched.
>
> I was wondering what maps launched mean? Is thi

How to debug reducer thread?

2010-04-27 Thread psdc1978
Hi,
The reduce tasks are threads that are launched by the Reducer. The print below shows the stack trace of one reduce task.
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchHashesOutputs(ReduceTask.java:2582)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
at org.apache.ha