We have a problem with this in our application: sometimes threads started by the map/reduce classes block the tasktracker$child process from exiting when the map/reduce is done. JMX is the number one cause of this for us, badly behaving JNI tasks are #2, and MINA is #3.

We modify the tasktracker$child main to call System.exit when it is done, and this resolves a very large share of these OOMs for us. The JNI tasks can still run the machine out of memory.

(Our JNI tasks have a 2.5 GB working set each - don't ask...)
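
A rough sketch of the kind of wrapper we use, purely illustrative (the class and method names below are made up, not the actual Hadoop 0.16 Child source):

    // Illustrative only: force the child JVM down once the task body has
    // finished, so stray non-daemon threads (JMX connector threads, JNI
    // helpers, MINA I/O threads) cannot keep the process alive.
    public class ChildMainWrapper {
        public static void main(String[] args) {
            int exitCode = 0;
            try {
                runTask(args);             // whatever the original child main did
            } catch (Throwable t) {
                t.printStackTrace();
                exitCode = 1;
            } finally {
                System.exit(exitCode);     // don't wait for lingering threads
            }
        }

        private static void runTask(String[] args) {
            // original task execution logic would go here
        }
    }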

Devaraj Das wrote:
Long term, we need to see how we can minimize the memory consumed by objects corresponding to completed tasks in the tasktracker.
-----Original Message-----
From: Devaraj Das [mailto:[EMAIL PROTECTED]
Sent: Friday, May 02, 2008 1:29 AM
To: 'core-user@hadoop.apache.org'
Subject: RE: OOM error with large # of map tasks

Hi Lili, sorry that I missed one important detail in my last response - tasks that complete successfully on tasktrackers are marked as COMMIT_PENDING by the tasktracker itself. The JobTracker takes those COMMIT_PENDING tasks, promotes their output (if applicable), and then marks them as SUCCEEDED. However, the tasktrackers are not notified of this, so the state of those tasks on the tasktrackers doesn't change, i.e., they remain in the COMMIT_PENDING state. In short, COMMIT_PENDING at the tasktracker's end doesn't necessarily mean the job is stuck.

The tasktracker keeps in its memory the objects corresponding to the tasks it runs; those objects are purged only on job completion or failure. This explains why you see so many tasks in the COMMIT_PENDING state. I believe the tasktracker creates one jobconf for every task it launches.

I am only concerned about the memory consumption by the jobconf objects. As per your report, it is ~1.6 MB per jobconf, and 572 of those comes to roughly the 940 MB of String objects you saw in the heap dump. You could try things out with an increased heap size for the tasktrackers/tasks: increase the tasktracker's heap by changing the value of HADOOP_HEAPSIZE in hadoop-env.sh, and increase the tasks' heap by tweaking the value of mapred.child.java.opts in the hadoop-site.xml for your job.
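
For example (the values below are only illustrative; pick sizes that fit your nodes):

In conf/hadoop-env.sh (tasktracker heap, in MB):

    export HADOOP_HEAPSIZE=2000

In the hadoop-site.xml used by your job (per-task child JVM heap):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>

Keep in mind that each task runs in its own child JVM, so the per-task -Xmx is multiplied by the number of tasks running concurrently on a node.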

-----Original Message-----
From: Lili Wu [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 01, 2008 4:19 AM
To: core-user@hadoop.apache.org
Subject: Re: OOM error with large # of map tasks

Hi Devaraj,

We don't have any special configuration on the job conf...

We only allow 3 map tasks and 3 reduce tasks on *one* node at any time, so we are puzzled why there are 572 job confs on *one* node. From the heap dump, we see there are 569 MapTask and 3 ReduceTask objects (which corresponds to 1138 MapTaskStatus and 6 ReduceTaskStatus objects) - 572 task objects in total, matching the JobConf count.

We *think* many map tasks were stuck in the COMMIT_PENDING stage, because in the heap dump we saw a lot of MapTaskStatus objects in either the "UNASSIGNED" or "COMMIT_PENDING" state (the runState variable in MapTaskStatus). We then looked at another node in the UI just now: for a given task tracker, under "Non-running tasks", there are at least 200 or 300 COMMIT_PENDING tasks. It appears they are stuck too.

Thanks a lot for your help!

Lili


On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:

Hi Lili, the jobconf memory consumption seems quite high. Could you please let us know if you pass anything in the jobconf of the jobs that you run? I think you are seeing the 572 objects since a job is running and the TaskInProgress objects for tasks of the running job are kept in memory (but I need to double check this).

Regarding COMMIT_PENDING, yes, it means that the tasktracker has finished executing the task but the jobtracker hasn't committed the output yet. In 0.16 all tasks necessarily take the transition RUNNING -> COMMIT_PENDING -> SUCCEEDED. This behavior has been improved in 0.17 (HADOOP-3140) so that only tasks that generate output go through COMMIT_PENDING; a task is marked SUCCEEDED directly if it doesn't generate any output in its output path.
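
A tiny sketch of the decision described above (the names are made up for illustration, not Hadoop's internal API):

    // Illustrative only: not Hadoop's actual classes or fields.
    class TaskStateSketch {
        enum TaskRunState { RUNNING, COMMIT_PENDING, SUCCEEDED }

        // 0.17 / HADOOP-3140 behaviour: only a task that actually produced
        // output needs the commit step; otherwise it goes straight to SUCCEEDED.
        static TaskRunState stateAfterTaskFinishes(boolean producedOutput) {
            return producedOutput ? TaskRunState.COMMIT_PENDING
                                  : TaskRunState.SUCCEEDED;
        }
    }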
Devaraj

-----Original Message-----
From: Lili Wu [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 01, 2008 2:09 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: OOM error with large # of map tasks

We are using Hadoop 0.16 and are seeing a consistent problem: out of memory errors when we have a large # of map tasks. The specifics of what is submitted when we reproduce this:

three large jobs:
1. 20,000 map tasks and 10 reduce tasks
2. 17,000 map tasks and 10 reduce tasks
3. 10,000 map tasks and 10 reduce tasks

These are at normal priority, and periodically we swap the priorities around to get some tasks started by each and let them complete. Other smaller jobs come and go every hour or so (no more than 200 map tasks, 4-10 reducers).

Our cluster consists of 23 nodes, and we have 69 map task slots and 69 reduce task slots in total.
Eventually, we see consistent OOM errors in the task logs, and the task tracker itself goes down on as many as 14 of our nodes.

We examined a heap dump after one of these crashes of a TaskTracker and found something interesting: there were 572 instances of JobConf that accounted for 940 MB of String objects. It seems quite odd that there are so many instances of JobConf. It seems to correlate with tasks in the COMMIT_PENDING state as shown on the status page for a task tracker node. Has anyone observed something like this? Can anyone explain what would cause tasks to remain in this state (which also apparently is held in memory vs. serialized to disk...)? In general, what does COMMIT_PENDING mean? (Job done, but output not committed to DFS?)

Thanks!


--
Jason Venner
Attributor - Program the Web <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers and coding wizards, contact if interested
