Re: Shuffle tasks getting killed

cliff palmer Thu, 23 Sep 2010 04:45:04 -0700

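Aniket, I wonder if these killed attempts are speculative execution at work.
With speculation enabled, the JobTracker can launch a second attempt of a slow
task and kill whichever attempt loses, and the loser shows up as "Killed" rather
than "Failed". Have you been able to determine whether the job itself completes
successfully?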
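
One way to test the speculation theory is to switch it off for reduces and see
whether the killed attempts go away. A minimal sketch, assuming the 0.20-era
property name and that you set it in the job configuration or mapred-site.xml
(I have not checked the name against the Yahoo distribution):

```xml
<!-- Assumption: Hadoop 0.20.x property name; default is true. -->
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

If the job driver goes through ToolRunner, the same setting can probably be
passed per job as -Dmapred.reduce.tasks.speculative.execution=false instead.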
HTH
Cliff


On Thu, Sep 23, 2010 at 12:52 AM, aniket ray <aniket....@gmail.com> wrote:

> Hi,
>
> I continuously run a series of batch jobs using Hadoop Map Reduce. I also
> have a managing daemon that moves data around on HDFS, making way for
> more jobs to be run.
> I use capacity scheduler to schedule many jobs in parallel.
>
> I see an issue on the Hadoop web monitoring UI at port 50030 which I believe
> may be causing a performance bottleneck and wanted to get more information.
>
> Approximately 10% of the reduce tasks show up as "Killed" in the UI. The
> logs say that the killed tasks are in the shuffle phase when they are killed,
> but the logs don't show any exception.
> My understanding is that these killed tasks would be started again and this
> slows down the whole Hadoop job.
> I was wondering what the possible issues may be and how to debug this?
>
> I have tried on both Hadoop 0.20.2 and the latest version of Hadoop from
> Yahoo's GitHub.
> I've monitored the nodes and there is a lot of free disk space and memory on
> all nodes (more than 1 TB free disk and 5 GB free memory at all times on all
> nodes).
>
> Since there are no exceptions or any other visible issues, I am finding it
> hard to figure out what the problem might be. Could anybody help?
>
> Thanks,
> -aniket
>
