[ 
https://issues.apache.org/jira/browse/HADOOP-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632088#action_12632088
 ] 

Amar Kamat commented on HADOOP-4018:
------------------------------------

Owen,
I think Dhruba's concern here is that many small or average-sized jobs can 
collectively overload the JobTracker; see 
[here|https://issues.apache.org/jira/browse/HADOOP-4018?focusedCommentId=12625505#action_12625505].
Capping individual jobs might not help, since all the jobs will still 
accumulate in the JT's memory and bring it down. I think some combination of 
local (per-job) capping, global capping, and smart scheduling/initialization 
might help. I agree that in the long term we need to model the memory better, 
but for now simple heuristics might work.
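
To make the local-plus-global capping idea concrete, here is a rough sketch. 
It is not JobTracker code and not what the attached patches do: the class 
name, the method names, and both limits are hypothetical, and it uses the 
task count as a crude stand-in for memory use. A job is admitted only if it 
fits under the per-job cap and the total across all admitted jobs stays 
under the global cap.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not JobTracker code: admits a job only if it passes
 * a per-job (local) cap and a cluster-wide (global) cap on tracked tasks.
 */
class TaskCapSketch {
    private final int maxTasksPerJob;  // local cap (assumed config value)
    private final int maxTasksTotal;   // global cap (assumed config value)
    private final Map<String, Integer> admitted = new HashMap<>();
    private int totalTasks = 0;

    TaskCapSketch(int maxTasksPerJob, int maxTasksTotal) {
        this.maxTasksPerJob = maxTasksPerJob;
        this.maxTasksTotal = maxTasksTotal;
    }

    /** Returns true and records the job if both caps allow it. */
    synchronized boolean tryAdmit(String jobId, int numTasks) {
        if (numTasks < 0 || admitted.containsKey(jobId)) {
            return false;                       // invalid or duplicate job
        }
        if (numTasks > maxTasksPerJob) {
            return false;                       // local cap exceeded
        }
        if ((long) totalTasks + numTasks > maxTasksTotal) {
            return false;                       // global cap exceeded
        }
        admitted.put(jobId, numTasks);
        totalTasks += numTasks;
        return true;
    }

    /** Frees the job's share of the global budget once it is retired. */
    synchronized void retire(String jobId) {
        Integer n = admitted.remove(jobId);
        if (n != null) {
            totalTasks -= n;
        }
    }

    synchronized int totalTasks() {
        return totalTasks;
    }
}
```

With this shape, a job rejected by the global cap could be queued and retried 
after retire() frees budget, which is roughly where the smart 
scheduling/initialization part would come in.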

> limit memory usage in jobtracker
> --------------------------------
>
>                 Key: HADOOP-4018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4018
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: maxSplits.patch, maxSplits2.patch, maxSplits3.patch, 
> maxSplits4.patch, maxSplits5.patch, maxSplits6.patch, maxSplits7.patch
>
>
> We have seen instances when a user submitted a job with many thousands of 
> mappers. The JobTracker was running with a 3GB heap, but that was still not 
> enough to prevent memory thrashing from garbage collection; effectively, the 
> JobTracker was not able to serve jobs and had to be restarted.
> One simple proposal would be to limit the maximum number of tasks per job. 
> This can be a configurable parameter. Are there other things that eat large 
> amounts of memory in the JobTracker?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
