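Aniket,

I wonder if these tasks were run under speculative execution. When a speculative duplicate of a task is launched, whichever attempt loses the race is killed once the other one finishes, and that shows up as "Killed" in the UI without any exception in the logs. Have you been able to determine whether the job itself completes successfully?

HTH,
Cliff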
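One way to check whether speculative execution is behind the killed attempts is to switch it off for a test run and see whether the "Killed" reduce tasks go away. Below is a minimal sketch that generates the configuration fragment for that. The two property names are the ones Hadoop 0.20.x uses for speculative execution; the helper name `speculative_off_config` is made up for illustration, and where the fragment belongs (for example the job configuration or mapred-site.xml) depends on your setup, so treat this as an assumption to verify rather than a recipe.

```python
import xml.etree.ElementTree as ET

# Property names used by Hadoop 0.20.x to toggle speculative execution
# for map and reduce tasks respectively.
SPECULATIVE_PROPS = (
    "mapred.map.tasks.speculative.execution",
    "mapred.reduce.tasks.speculative.execution",
)


def speculative_off_config(props=SPECULATIVE_PROPS):
    """Return a <configuration> XML string that sets each property to false.

    Hypothetical helper for illustration; it only builds the XML text and
    does not talk to a cluster.
    """
    root = ET.Element("configuration")
    for name in props:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = "false"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the fragment so it can be reviewed before merging it
    # into an existing configuration file by hand.
    print(speculative_off_config())
```

If the killed attempts disappear with these settings and the job time does not get worse, speculative execution was the likely source and the kills were probably harmless. If they persist, the cause is something else and the TaskTracker logs for those attempts are the next place to look.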
On Thu, Sep 23, 2010 at 12:52 AM, aniket ray <aniket....@gmail.com> wrote:
> Hi,
>
> I continuously run a series of batch jobs using Hadoop MapReduce. I also
> have a managing daemon that moves data around on HDFS, making way for
> more jobs to be run.
> I use the capacity scheduler to schedule many jobs in parallel.
>
> I see an issue on the Hadoop web monitoring UI at port 50030 which I
> believe may be causing a performance bottleneck, and I wanted to get
> more information.
>
> Approximately 10% of the reduce tasks show up as "Killed" in the UI. The
> logs say that the killed tasks are in the shuffle phase when they are
> killed, but the logs don't show any exception.
> My understanding is that these killed tasks would be started again, and
> this slows down the whole Hadoop job.
> I was wondering what the possible issues may be and how to debug this?
>
> I have tried both Hadoop 0.20.2 and the latest version of Hadoop from
> Yahoo's GitHub.
> I've monitored the nodes, and there is a lot of free disk space and
> memory on all nodes (more than 1 TB free disk and 5 GB free memory at
> all times on all nodes).
>
> Since there are no exceptions or any other visible issues, I am finding
> it hard to figure out what the problem might be. Could anybody help?
>
> Thanks,
> -aniket