[
https://issues.apache.org/jira/browse/HADOOP-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615687#action_12615687
]
Hemanth Yamijala commented on HADOOP-3759:
------------------------------------------
Some more details on implementation:
- HADOOP-3581 introduces the configuration parameters that specify the maximum
amount of RAM allowed for all tasks on a TaskTracker, and the memory limit per
task of a job. The per-job limit is defined in the JobConf of the job, and the
TaskTracker-wide maximum is defined in the TaskTracker's configuration. (Refer
to the comments
[here|https://issues.apache.org/jira/browse/HADOOP-3581?focusedCommentId=12614295#action_12614295]
and
[here|https://issues.apache.org/jira/browse/HADOOP-3581?focusedCommentId=12615679#action_12615679].)
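For illustration, the two limits might be set and read as below. The property
names here are placeholders; the actual names are being settled in HADOOP-3581.
{code}
// Placeholder property names, pending HADOOP-3581.
JobConf jobConf = new JobConf();
// Per-task virtual memory limit for this job's tasks (e.g. 2 GB).
jobConf.setLong("mapred.task.maxmemory", 2L * 1024 * 1024 * 1024);

// On the TaskTracker, the node-wide limit comes from its own configuration
// (e.g. 8 GB for all tasks on the node; -1 as an example "not set" default).
Configuration ttConf = new Configuration();
long maxMemoryForAllTasks =
    ttConf.getLong("mapred.tasktracker.tasks.maxmemory", -1L);
{code}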
- In {{TaskTracker.transmitHeartbeat()}}, we compute the amount of free virtual
memory as: (maximum allowed RAM) - (sum of the maximum memory per task, over
all tasks running on the node).
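A minimal sketch of that computation, with placeholder field and method names
(the actual TaskTracker internals may differ):
{code}
// Sketch of the heartbeat-time computation; names are placeholders.
long freeVirtualMemory = maxVirtualMemoryForAllTasks;
for (JobConf taskConf : confsOfRunningTasks) {
  // Subtract each running task's configured maximum, not its current
  // usage, so the reported value is a guaranteed lower bound.
  freeVirtualMemory -= taskConf.getLong("mapred.task.maxmemory",
                                        defaultMemoryPerTask);
}
status.setFreeVirtualMemory(freeVirtualMemory);
{code}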
- In {{TaskTrackerStatus.java}}, we define a map of key-value pairs to transmit
such resource information to the JobTracker. This allows additional resources
to be added over time without changing the wire protocol. For convenience, we
can still provide accessor methods in TaskTrackerStatus to get/set these
key-value pairs. Something like:
{code}
// Generic resource map: new resource types can be added as entries
// without changing the wire protocol.
Map<String, Long> resourceInfo;

public long getFreeVirtualMemory() {
  return resourceInfo.get("memory").longValue();
}

public void setFreeVirtualMemory(long freeVirtualMemory) {
  resourceInfo.put("memory", Long.valueOf(freeVirtualMemory));
}
{code}
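Transmitting a map keeps the wire format stable: a new resource type only adds
an entry, and the (de)serialization logic is untouched. A rough sketch of how
TaskTrackerStatus might write and read the map, assuming the usual Writable
conventions (this is not a final wire format):
{code}
// Sketch of Writable-style (de)serialization for the resource map.
public void write(DataOutput out) throws IOException {
  out.writeInt(resourceInfo.size());
  for (Map.Entry<String, Long> entry : resourceInfo.entrySet()) {
    WritableUtils.writeString(out, entry.getKey());
    out.writeLong(entry.getValue().longValue());
  }
}

public void readFields(DataInput in) throws IOException {
  int numResources = in.readInt();
  resourceInfo = new HashMap<String, Long>();
  for (int i = 0; i < numResources; i++) {
    String key = WritableUtils.readString(in);
    resourceInfo.put(key, Long.valueOf(in.readLong()));
  }
}
{code}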
- In {{JobInProgress}}, we define APIs to get the configured virtual memory
requirements for a job. These will read from the JobConf of the job.
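Something along these lines, where the property name and default are
placeholders pending HADOOP-3581:
{code}
// In JobInProgress: reads the per-task virtual memory requirement from
// the job's configuration. Property name and default are placeholders.
public long getMaxVirtualMemoryForTask() {
  return conf.getLong("mapred.task.maxmemory", DEFAULT_MAX_TASK_MEMORY);
}
{code}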
- Using these, any scheduler (such as HADOOP-3445) can match the memory
requirements of a job against the resource info reported by the TaskTracker
and make scheduling decisions. An example check is sketched below.
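For example, a scheduler could guard its assignment decision with a check like
the following ({{assignTask}} and the getter are hypothetical; the actual
scheduler interface is being worked out in HADOOP-3445):
{code}
// Sketch: assign a task of this job to the tracker only if the tracker
// reports enough free virtual memory for one more of the job's tasks.
long required = job.getMaxVirtualMemoryForTask();           // hypothetical API
long available = taskTrackerStatus.getFreeVirtualMemory();
if (required <= available) {
  assignTask(job, taskTrackerStatus);                       // hypothetical helper
}
// Otherwise skip this tracker: it cannot guarantee the job's requirement.
{code}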
Comments?
> Provide ability to run memory intensive jobs without affecting other running
> tasks on the nodes
> -----------------------------------------------------------------------------------------------
>
> Key: HADOOP-3759
> URL: https://issues.apache.org/jira/browse/HADOOP-3759
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
> Fix For: 0.19.0
>
>
> In HADOOP-3581, we are discussing how to prevent memory intensive tasks from
> affecting Hadoop daemons and other tasks running on a node. A related
> requirement is that users be provided an ability to run jobs which are memory
> intensive. The system must provide enough knobs to allow such jobs to be run
> while still maintaining the requirements of HADOOP-3581.