[
https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644096#action_12644096
]
Owen O'Malley commented on HADOOP-4035:
---------------------------------------
I guess I'm ok with it as a delta from total virtual memory, although how to
detect the virtual memory size in a generic manner is an interesting question.
Maybe, as I proposed over in HADOOP-4523, we need a plugin that can provide
OS-specific or site-specific functionality.
Note that if we are using virtual memory, then we absolutely need a separate
configuration for the amount of memory that we'd like to schedule against.
We do not *want* the scheduler to put four 10G tasks on a machine with 8G of
RAM and 32G of swap. That number should be based on RAM. So, I'd propose that
we extend the plugin interface as:
{code}
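public abstract class MemoryPlugin {
  // Real (unadjusted) virtual memory size of the node.
  public abstract long getVirtualMemorySize(Configuration conf);
  // Real (unadjusted) physical RAM size of the node.
  public abstract long getRamSize(Configuration conf);
}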
{code}
I'd propose that these values be the real values and that we have a configured
offset for each of them:
  mapred.tasktracker.virtualmemory.reserved
    (subtracted from virtual memory)
  mapred.tasktracker.memory.reserved
    (subtracted from physical RAM, before reporting to the JT)
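As a rough illustration of how the two offsets could be applied, here is a
minimal sketch. The class name ReportedMemory and the method fromRaw are
hypothetical and not part of Hadoop; the raw values stand in for whatever a
MemoryPlugin would return, and clamping at zero is an assumption rather than
part of the proposal.

```java
// Hypothetical helper, not an existing Hadoop class.
// Holds the memory figures a TaskTracker would report after the
// configured reserved amounts have been subtracted.
final class ReportedMemory {
    final long virtualBytes;
    final long ramBytes;

    ReportedMemory(long virtualBytes, long ramBytes) {
        this.virtualBytes = virtualBytes;
        this.ramBytes = ramBytes;
    }

    // rawVirtual / rawRam: the real values, as a MemoryPlugin would supply them.
    // reservedVirtual: value of mapred.tasktracker.virtualmemory.reserved.
    // reservedRam: value of mapred.tasktracker.memory.reserved.
    // Assumption: a reservation larger than the real value yields zero
    // instead of a negative number.
    static ReportedMemory fromRaw(long rawVirtual, long rawRam,
                                  long reservedVirtual, long reservedRam) {
        long virtual = Math.max(0L, rawVirtual - reservedVirtual);
        long ram = Math.max(0L, rawRam - reservedRam);
        return new ReportedMemory(virtual, ram);
    }
}
```

With 40G of virtual memory, 8G of RAM, and reservations of 2G and 1G, this
sketch would report 38G and 7G respectively.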
Jobs should then define a soft and hard limit for their memory usage. If a task
goes over the hard limit, it should be killed immediately.
The scheduler should only allocate tasks if both of these hold:
  sum(soft limits of tasks) <= TT RAM
  sum(hard limits of tasks) <= TT virtual memory
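The two conditions above can be sketched as a simple admission check. This is
only an illustration under stated assumptions: TaskLimits and MemoryAdmission
are hypothetical names, not Hadoop classes, and the TT figures are taken to be
the values already reduced by the reserved offsets.

```java
import java.util.List;

// Hypothetical value type: the soft and hard memory limits a job
// declares for each of its tasks.
final class TaskLimits {
    final long softBytes;
    final long hardBytes;

    TaskLimits(long softBytes, long hardBytes) {
        this.softBytes = softBytes;
        this.hardBytes = hardBytes;
    }
}

// Hypothetical helper illustrating the proposed scheduling rule.
final class MemoryAdmission {
    // Returns true only if adding the candidate keeps
    //   sum(soft limits) <= TT RAM  and
    //   sum(hard limits) <= TT virtual memory.
    static boolean canSchedule(List<TaskLimits> running, TaskLimits candidate,
                               long ttRamBytes, long ttVirtualBytes) {
        long softSum = candidate.softBytes;
        long hardSum = candidate.hardBytes;
        for (TaskLimits t : running) {
            softSum += t.softBytes;
            hardSum += t.hardBytes;
        }
        return softSum <= ttRamBytes && hardSum <= ttVirtualBytes;
    }
}
```

For example, take tasks with a 10G hard limit and an assumed 3G soft limit on
a TT reporting 8G of RAM and 40G of virtual memory (assuming virtual memory is
counted as RAM plus swap). The hard-limit check alone would admit four such
tasks, but the soft-limit check stops at two, since a third would bring the
soft sum to 9G.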
Thoughts?
> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4035
> URL: https://issues.apache.org/jira/browse/HADOOP-4035
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Affects Versions: 0.19.0
> Reporter: Hemanth Yamijala
> Assignee: Vinod K V
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt,
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify
> memory requirements for jobs, and also modified the tasktrackers to report
> their free memory. The capacity scheduler in HADOOP-3445 should schedule
> tasks based on these parameters. A task that is scheduled on a TT that uses
> more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on
> the TT by more than the default amount while it runs. The scheduler should
> make the used capacity account for this additional usage while enforcing
> limits, etc.