[ https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Chen updated MAPREDUCE-1221:
----------------------------------

    Attachment: MAPREDUCE-1221-v1.patch

The patch allows us to set an amount of physical memory that will be kept free rather than used to run tasks. If this limit is violated, the task using the highest amount of memory will be killed.

Example: configure mapreduce.tasktracker.reserved.physicalmemory.mb=3072. If a TaskTracker has 16GB of memory and its tasks are currently using 14GB, then 16GB - 14GB < 3GB, so TaskMemoryManagerThread will kill the task using the highest amount of memory. Note that if this value is not configured, the policy is not triggered.

Killing a task slows down its job because the task has to be scheduled again. But if the task is not killed in this case, it is very likely to fail on this node anyway, because the node has no memory left to run it, and the node itself might crash. We choose the highest memory-consuming task to kill because it most likely belongs to the misbehaving job that is causing the problem. (A rough sketch of this check, under hypothetical names, follows at the end of this message.)

Part of this patch was done by Dhruba. He sent me his half-finished patch and I continued from there.

> Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1221
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: dhruba borthakur
>            Assignee: Scott Chen
>         Attachments: MAPREDUCE-1221-v1.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds a set of configured thresholds. I would like to extend this feature to enable killing tasks if the physical memory used by that task exceeds a certain threshold.
>
> On a certain operating system (guess?), if user space processes start using lots of memory, the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs from triggering this condition. From my understanding, the killing-based-on-virtual-memory-limits feature (HADOOP-5883) was designed to address this problem. This works well when most map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the max virtual memory for each task. On the other hand, if each task forks off mappers/reducers written in other languages (python/php, etc), the total virtual memory usage of the process-subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits.
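As a usage note, the reserved amount from the example above would be set in the TaskTracker's configuration (mapred-site.xml is the usual place for TaskTracker properties; only the property name comes from the patch):

{code:xml}
<property>
  <name>mapreduce.tasktracker.reserved.physicalmemory.mb</name>
  <value>3072</value>
</property>
{code}

Below is a minimal sketch of the check described above. It is not the actual patch: the class, method, and field names are hypothetical, and only the configuration key mapreduce.tasktracker.reserved.physicalmemory.mb comes from the patch. In the real TaskTracker the per-task memory numbers would come from TaskMemoryManagerThread's monitoring.

{code:java}
import java.util.List;

/**
 * Sketch of the reserved-physical-memory kill policy (hypothetical names).
 * If total RAM minus the memory used by tasks drops below the reserved
 * amount, the task using the most physical memory is chosen to be killed.
 */
public class ReservedPhysicalMemoryPolicy {

  /** Reserved physical memory in bytes; a non-positive value disables the policy. */
  private final long reservedPhysicalMemoryBytes;

  public ReservedPhysicalMemoryPolicy(long reservedPhysicalMemoryMB) {
    this.reservedPhysicalMemoryBytes = reservedPhysicalMemoryMB * 1024L * 1024L;
  }

  /** Simplified view of one running task's measured physical memory usage. */
  public static class TaskMemoryUsage {
    final String taskId;
    final long physicalMemoryBytes;

    public TaskMemoryUsage(String taskId, long physicalMemoryBytes) {
      this.taskId = taskId;
      this.physicalMemoryBytes = physicalMemoryBytes;
    }
  }

  /**
   * Returns the id of the task that should be killed, or null if none.
   *
   * @param totalPhysicalMemoryBytes total RAM on the machine
   * @param tasks current per-task physical memory usage
   */
  public String selectTaskToKill(long totalPhysicalMemoryBytes,
                                 List<TaskMemoryUsage> tasks) {
    if (reservedPhysicalMemoryBytes <= 0 || tasks.isEmpty()) {
      return null;   // policy not configured, or nothing running to kill
    }
    long usedByTasks = 0;
    TaskMemoryUsage biggest = null;
    for (TaskMemoryUsage t : tasks) {
      usedByTasks += t.physicalMemoryBytes;
      if (biggest == null || t.physicalMemoryBytes > biggest.physicalMemoryBytes) {
        biggest = t;
      }
    }
    long freeForTasks = totalPhysicalMemoryBytes - usedByTasks;
    if (freeForTasks >= reservedPhysicalMemoryBytes) {
      return null;   // enough physical memory is still free
    }
    // Free memory fell below the reserved amount: kill the task using the
    // most physical memory, on the theory that it belongs to the job that
    // is causing the problem.
    return biggest.taskId;
  }
}
{code}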