Hi,
In fact we verified it is our jobconf -- we have about 800 KB of input paths
(11k files covering a few TB of data).
We'll indeed raise the heap size to about 2048m, and we can also significantly
shrink the file path list (using wildcards, among other things).
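For example, instead of adding each of the 11k files individually, a glob
pattern keeps the serialized path list tiny. A rough sketch (the path layout
and class name are made up, and the input-path API varies across Hadoop
versions):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.JobConf;

  public class InputSetup {
    public static void configure(JobConf conf) {
      // One pattern instead of ~11k explicit file names: the jobconf
      // stores only the pattern string, and the glob is expanded later,
      // when input splits are computed.
      conf.setInputPath(new Path("/data/logs/2008-*/part-*"));
    }
  }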

Is there any plan to make the storage of the JobConf objects more
memory-efficient?  Perhaps they could be serialized to disk, or, if the
jobconf doesn't change per task (i.e., it's inherited and never modified),
why not keep one per job in a tasktracker?  (Or, if it does change, 'share'
the common parts?)  This would greatly help us: we have 3 jobs of ~20k tasks
each, and if one gets halfway through and we bump another job's priority up,
we end up with 1000s of completed tasks (but no completed jobs) per
tasktracker.  Even with our jobconf trimmed and the heap size increased,
we'll hit a limit pretty quickly.
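To illustrate the 'one per job' idea (purely hypothetical -- this is not how
the tasktracker works today, and every name below is made up):

  import java.util.HashMap;
  import java.util.Map;
  import org.apache.hadoop.mapred.JobConf;

  // Hypothetical cache: all tasks of a job share one JobConf instance
  // instead of each task holding its own ~1.6 MB copy.
  public class JobConfCache {
    private final Map<String, JobConf> confsByJobId =
        new HashMap<String, JobConf>();

    public synchronized JobConf getOrPut(String jobId, JobConf conf) {
      JobConf cached = confsByJobId.get(jobId);
      if (cached == null) {
        confsByJobId.put(jobId, conf);
        cached = conf;
      }
      return cached;
    }

    // Purged on job completion/failure, mirroring what the tasktracker
    // already does for its per-task objects.
    public synchronized void purge(String jobId) {
      confsByJobId.remove(jobId);
    }
  }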

thx,
-sr

On Thu, May 1, 2008 at 12:58 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:

> Hi Lili, sorry that I missed one important detail in my last response --
> tasks that complete successfully on tasktrackers are marked as
> COMMIT_PENDING by the tasktracker itself. The JobTracker takes those
> COMMIT_PENDING tasks, promotes their output (if applicable), and then
> marks them as SUCCEEDED. However, tasktrackers are not notified about
> this, so the state of the tasks on the tasktrackers doesn't change,
> i.e., they remain in the COMMIT_PENDING state. In short, COMMIT_PENDING
> at the tasktracker's end doesn't necessarily mean the job is stuck.
>
> The tasktracker keeps in its memory the objects corresponding to the
> tasks it runs. Those objects are purged only on job completion/failure.
> This explains why you see so many tasks in the COMMIT_PENDING state. I
> believe it creates one jobconf for every task it launches.
>
> I am only concerned about the memory consumption of the jobconf objects.
> As per your report, it is ~1.6 MB per jobconf (940 MB / 572 instances).
>
> You could try things out with an increased heap size for the
> tasktrackers/tasks. You can increase the heap size for the tasktracker
> by changing the value of HADOOP_HEAPSIZE in hadoop-env.sh, and the
> tasks' heap size can be increased by tweaking the value of
> mapred.child.java.opts in the hadoop-site.xml for your job.
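>
> For example (illustrative values only; pick sizes appropriate for your
> nodes -- the two settings are the ones named above):
>
>   # hadoop-env.sh: heap for the tasktracker (and other daemons), in MB
>   export HADOOP_HEAPSIZE=2048
>
>   <!-- hadoop-site.xml: JVM options for each task's child process -->
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx1024m</value>
>   </property>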
>
> > -----Original Message-----
> > From: Lili Wu [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, May 01, 2008 4:19 AM
> > To: core-user@hadoop.apache.org
> > Subject: Re: OOM error with large # of map tasks
> >
> > Hi Devaraj,
> >
> > We don't have any special configuration on the job conf...
> >
> > We only allow 3 map tasks and 3 reduce tasks on *one* node at any
> > time, so we are puzzled why there are 572 job confs on *one* node.
> > From the heap dump, we see there are 569 MapTask and 3 ReduceTask
> > instances (corresponding to 1138 MapTaskStatus and 6
> > ReduceTaskStatus).
> >
> > We *think* many map tasks were stuck in the COMMIT_PENDING stage,
> > because in the heap dump we saw a lot of MapTaskStatus objects in
> > either the "UNASSIGNED" or the "COMMIT_PENDING" state (the runState
> > variable in MapTaskStatus).  Then we took a look at another node in
> > the UI just now: for a given task tracker, under "Non-running
> > tasks", there are at least 200 or 300 COMMIT_PENDING tasks.  It
> > appears they are stuck too.
> >
> > Thanks a lot for your help!
> >
> > Lili
> >
> >
> > On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das
> > <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Lili, the jobconf memory consumption seems quite high. Could you
> > > please let us know if you pass anything in the jobconf of the jobs
> > > that you run? I think you are seeing the 572 objects because a job
> > > is running and the TaskInProgress objects for tasks of the running
> > > job are kept in memory (but I need to double-check this).
> > >
> > > Regarding COMMIT_PENDING, yes, it means that the tasktracker has
> > > finished executing the task but the jobtracker hasn't committed the
> > > output yet. In 0.16 all tasks have to take the transition
> > > RUNNING -> COMMIT_PENDING -> SUCCEEDED. This behavior has been
> > > improved in 0.17 (HADOOP-3140) so that only tasks that generate
> > > output go through COMMIT_PENDING; a task is marked SUCCEEDED
> > > directly if it doesn't generate any output in its output path.
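> > >
> > > To summarize the transitions:
> > >
> > >   0.16:  RUNNING -> COMMIT_PENDING -> SUCCEEDED   (every task)
> > >   0.17:  RUNNING -> COMMIT_PENDING -> SUCCEEDED   (tasks with output)
> > >          RUNNING -> SUCCEEDED                     (tasks with no output)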
> > >
> > > Devaraj
> > >
> > > > -----Original Message-----
> > > > From: Lili Wu [mailto:[EMAIL PROTECTED]
> > > > Sent: Thursday, May 01, 2008 2:09 AM
> > > > To: core-user@hadoop.apache.org
> > > > Cc: [EMAIL PROTECTED]
> > > > Subject: OOM error with large # of map tasks
> > > >
> > > > We are using Hadoop 0.16 and are seeing a consistent problem:
> > > > out-of-memory errors when we have a large # of map tasks.
> > > > The specifics of what is submitted when we reproduce this:
> > > >
> > > > Three large jobs:
> > > > 1. 20,000 map tasks and 10 reduce tasks
> > > > 2. 17,000 map tasks and 10 reduce tasks
> > > > 3. 10,000 map tasks and 10 reduce tasks
> > > >
> > > > These run at normal priority, and periodically we swap the
> > > > priorities around to get some tasks started by each job and let
> > > > them complete. Other smaller jobs come and go every hour or so
> > > > (no more than 200 map tasks, 4-10 reducers).
> > > >
> > > > Our cluster consists of 23 nodes, with 69 map task slots and 69
> > > > reduce task slots total (3 of each per node). Eventually we see
> > > > consistent OOM errors in the task logs, and the tasktracker
> > > > itself goes down on as many as 14 of our nodes.
> > > >
> > > > We examined a heap dump after one of these TaskTracker crashes
> > > > and found something interesting: there were 572 instances of
> > > > JobConf that accounted for 940 MB of String objects.  It seems
> > > > quite odd that there are so many instances of JobConf.  The count
> > > > appears to correlate with the tasks in the COMMIT_PENDING state
> > > > shown on the status page for a tasktracker node.  Has anyone
> > > > observed something like this?  Can anyone explain what would
> > > > cause tasks to remain in this state?  (A state in which they
> > > > apparently stay in memory rather than being serialized to
> > > > disk...)  In general, what does COMMIT_PENDING mean?  (Job done,
> > > > but output not committed to DFS?)
> > > >
> > > > Thanks!
> > > >
> > >
> > >
> >
>
>
