We have a problem with this in our application: sometimes threads started by the map/reduce classes block the tasktracker$child process from exiting when the map/reduce is done. JMX is the number one cause of this for us, badly behaving JNI tasks are #2, and MINA is #3.

We modify the tasktracker$child main to call System.exit when it is done, and this resolves a very large share of these OOMs for us. The JNI tasks can still run the machine out of memory.

(Our JNI tasks have a 2.5 GB working set each - don't ask...)
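
A rough sketch of the kind of wrapper we use, purely illustrative (the class and method names below are made up, not the actual Hadoop 0.16 Child source):

    // Illustrative only: force the child JVM down once the task body has
    // finished, so stray non-daemon threads (JMX connector threads, JNI
    // helpers, MINA I/O threads) cannot keep the process alive.
    public class ChildMainWrapper {
        public static void main(String[] args) {
            int exitCode = 0;
            try {
                runTask(args);             // whatever the original child main did
            } catch (Throwable t) {
                t.printStackTrace();
                exitCode = 1;
            } finally {
                System.exit(exitCode);     // don't wait for lingering threads
            }
        }

        private static void runTask(String[] args) {
            // original task execution logic would go here
        }
    }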

Devaraj Das wrote:
Long term, we need to see how we can minimize the memory consumed by objects corresponding to completed tasks in the tasktracker.
-----Original Message-----
From: Devaraj Das [mailto:[EMAIL PROTECTED]
Sent: Friday, May 02, 2008 1:29 AM
To: 'core-user@hadoop.apache.org'
Subject: RE: OOM error with large # of map tasks

Hi Lili, sorry that I missed one important detail in my last response - tasks that complete successfully on tasktrackers are marked as COMMIT_PENDING by the tasktracker itself. The JobTracker takes those COMMIT_PENDING tasks, promotes their output (if applicable), and then marks them as SUCCEEDED. However, the tasktrackers are not notified of this, so the state of those tasks on the tasktrackers doesn't change, i.e., they remain in the COMMIT_PENDING state. In short, COMMIT_PENDING at the tasktracker's end doesn't necessarily mean the job is stuck.

The tasktracker keeps in its memory the objects corresponding to the tasks it runs; those objects are purged only on job completion or failure. This explains why you see so many tasks in the COMMIT_PENDING state. I believe the tasktracker creates one jobconf for every task it launches.

I am only concerned about the memory consumption by the jobconf objects. As per your report, it is ~1.6 MB per jobconf, and 572 of those comes to roughly the 940 MB of String objects you saw in the heap dump. You could try things out with an increased heap size for the tasktrackers/tasks: increase the tasktracker's heap by changing the value of HADOOP_HEAPSIZE in hadoop-env.sh, and increase the tasks' heap by tweaking the value of mapred.child.java.opts in the hadoop-site.xml for your job.
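
For example (the values below are only illustrative; pick sizes that fit your nodes):

In conf/hadoop-env.sh (tasktracker heap, in MB):

    export HADOOP_HEAPSIZE=2000

In the hadoop-site.xml used by your job (per-task child JVM heap):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>

Keep in mind that each task runs in its own child JVM, so the per-task -Xmx is multiplied by the number of tasks running concurrently on a node.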

-----Original Message-----
From: Lili Wu [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 01, 2008 4:19 AM
To: core-user@hadoop.apache.org
Subject: Re: OOM error with large # of map tasks

Hi Devaraj,

We don't have any special configuration on the job conf...

We only allow 3 map tasks and 3 reduce tasks on *one* node at any time, so we are puzzled why there are 572 job confs on *one* node. From the heap dump, we see there are 569 MapTask and 3 ReduceTask objects (which corresponds to 1138 MapTaskStatus and 6 ReduceTaskStatus objects) - 572 task objects in total, matching the JobConf count.

We *think* many map tasks were stuck in the COMMIT_PENDING stage, because in the heap dump we saw a lot of MapTaskStatus objects in either the "UNASSIGNED" or "COMMIT_PENDING" state (the runState variable in MapTaskStatus). We then looked at another node in the UI just now: for a given task tracker, under "Non-running tasks", there are at least 200 or 300 COMMIT_PENDING tasks. It appears they are stuck too.

Thanks a lot for your help!

Lili


On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:

Hi Lili, the jobconf memory consumption seems quite high. Could you please let us know if you pass anything in the jobconf of the jobs that you run? I think you are seeing the 572 objects since a job is running and the TaskInProgress objects for tasks of the running job are kept in memory (but I need to double check this).

Regarding COMMIT_PENDING, yes, it means that the tasktracker has finished executing the task but the jobtracker hasn't committed the output yet. In 0.16 all tasks necessarily take the transition RUNNING -> COMMIT_PENDING -> SUCCEEDED. This behavior has been improved in 0.17 (HADOOP-3140) so that only tasks that generate output go through COMMIT_PENDING; a task is marked SUCCEEDED directly if it doesn't generate any output in its output path.
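
A tiny sketch of the decision described above (the names are made up for illustration, not Hadoop's internal API):

    // Illustrative only: not Hadoop's actual classes or fields.
    class TaskStateSketch {
        enum TaskRunState { RUNNING, COMMIT_PENDING, SUCCEEDED }

        // 0.17 / HADOOP-3140 behaviour: only a task that actually produced
        // output needs the commit step; otherwise it goes straight to SUCCEEDED.
        static TaskRunState stateAfterTaskFinishes(boolean producedOutput) {
            return producedOutput ? TaskRunState.COMMIT_PENDING
                                  : TaskRunState.SUCCEEDED;
        }
    }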
Devaraj

-----Original Message-----
From: Lili Wu [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 01, 2008 2:09 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: OOM error with large # of map tasks

We are using Hadoop 0.16 and are seeing a consistent problem: out of memory errors when we have a large # of map tasks. The specifics of what is submitted when we reproduce this:

three large jobs:
1. 20,000 map tasks and 10 reduce tasks
2. 17,000 map tasks and 10 reduce tasks
3. 10,000 map tasks and 10 reduce tasks

These are at normal priority, and periodically we swap the priorities around to get some tasks started by each and let them complete. Other smaller jobs come and go every hour or so (no more than 200 map tasks, 4-10 reducers).

Our cluster consists of 23 nodes, and we have 69 map task slots and 69 reduce task slots in total.
Eventually, we see consistent OOM errors in the task logs, and the task tracker itself goes down on as many as 14 of our nodes.

We examined a heap dump after one of these crashes of a TaskTracker and found something interesting: there were 572 instances of JobConf that accounted for 940 MB of String objects. It seems quite odd that there are so many instances of JobConf. It seems to correlate with tasks in the COMMIT_PENDING state as shown on the status page for a task tracker node. Has anyone observed something like this? Can anyone explain what would cause tasks to remain in this state (which also apparently is held in memory vs. serialized to disk...)? In general, what does COMMIT_PENDING mean? (Job done, but output not committed to DFS?)

Thanks!


--
Jason Venner
Attributor - Program the Web <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers and coding wizards, contact if interested
