[
https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655190#action_12655190
]
Vinod K V commented on HADOOP-4035:
-----------------------------------
The list of tests that I ran on a real cluster:
- Disable TT memory management and memory-based scheduling. Run a sleep job.
The job should run successfully, with no tasks failing, irrespective of its
virtual memory usage.
- Disable TT memory management and memory-based scheduling. Run a sleep job
after configuring its mapred.task.maxvmem and mapred.task.maxpmem to large
values (see the sample invocation below). The job should run successfully,
with no tasks failing.
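A sleep job with explicit memory limits can be submitted along these lines;
the jar path and the -D values (4 GB vmem and 2 GB pmem, in bytes) are
illustrative and may vary per build:
{code}
bin/hadoop jar build/hadoop-*-test.jar sleep \
    -D mapred.task.maxvmem=4294967296 \
    -D mapred.task.maxpmem=2147483648 \
    -m 10 -r 10 -mt 30000 -rt 30000
{code}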
For the remaining tests, enable both TT memory management and memory-based
scheduling; a sample configuration follows.
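A minimal sketch of the site configuration for enabling both, assuming the
keys referenced in these tests take byte values (the values below are
illustrative only; -1 is the disabled marker):
{code}
<!-- hadoop-site.xml: illustrative values only -->
<property><name>mapred.tasktracker.vmem.reserved</name><value>1073741824</value></property>
<property><name>mapred.tasktracker.pmem.reserved</name><value>536870912</value></property>
<property><name>mapred.task.default.maxvmem</name><value>2147483648</value></property>
<property><name>mapred.task.default-pmem-percentage-in-vmem</name><value>80</value></property>
<property><name>mapred.task.limit.maxvmem</name><value>8589934592</value></property>
<property><name>mapred.task.limit.maxpmem</name><value>4294967296</value></property>
{code}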
- Verifying invalid jobs: Configure a sleep job such that the job's
mapred.task.maxvmem oversteps the cluster-wide mapred.task.limit.maxvmem. The
job should be rejected by the scheduler (see the sample rejection test after
these checks).
- Verifying invalid jobs: Configure a sleep job such that the job's
mapred.task.maxpmem oversteps the cluster-wide mapred.task.limit.maxpmem. The
job should again be rejected by the scheduler.
- Verifying default values: Submit a sleep job without any memory-related
configuration. It should pick up the cluster-wide defaults specified via
mapred.task.default.maxvmem and mapred.task.default-pmem-percentage-in-vmem.
Verify this by looking at the job.xml link on the JT web UI.
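As a concrete instance of the rejection tests: with mapred.task.limit.maxvmem
set to 8 GB as in the sample configuration above, a submission like the
following (asking for 16 GB of vmem; jar path as before) should be rejected by
the scheduler:
{code}
bin/hadoop jar build/hadoop-*-test.jar sleep \
    -D mapred.task.maxvmem=17179869184 -m 1 -r 1
{code}
The defaults test is the converse: submit with no memory -D options at all and
confirm from job.xml that the job's memory settings picked up the cluster-wide
defaults.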
The following tests currently work only on Linux. mapred.task.limit.maxvmem
and mapred.task.limit.maxpmem should be set considerably high so that they do
not interfere with the jobs.
- Verifying vmem-based scheduling: Start the cluster with 1 map slot and 1
reduce slot on every TT. First submit a sleep job *job1* with map tasks
occupying the whole cluster and only one reduce task. The job's
mapred.task.maxvmem should be less than (TT total vmem -
mapred.tasktracker.vmem.reserved); TT total vmem is the sum of the MemTotal
and SwapTotal fields of /proc/meminfo (see the snippet after these tests).
While the tasks of this job are still running, submit another sleep job *job2*
with one map task and reduce tasks occupying the whole cluster, using the same
mapred.task.maxvmem. This job should be blocked by the cluster until tasks of
the first job start finishing. Now submit *job3*, identical to job1; it should
be blocked until tasks of job2 start finishing.
- Verifying pmem-based scheduling: Start the cluster with 1 map slot and 1
reduce slot on every TT. First submit a sleep job *job1* with map tasks
occupying the whole cluster and only one reduce task. The job's
mapred.task.maxpmem should be less than (TT total pmem -
mapred.tasktracker.pmem.reserved); TT total pmem is the MemTotal field of
/proc/meminfo. While the tasks of this job are still running, submit another
sleep job *job2* with one map task and reduce tasks occupying the whole
cluster, using the same mapred.task.maxpmem. This job should be blocked by the
cluster until tasks of the first job start finishing. Now submit *job3*,
identical to job1; it should be blocked until tasks of job2 start finishing.
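The TT totals used above can be read off /proc/meminfo, for example (values
there are reported in kB):
{code}
# vmem = MemTotal + SwapTotal, pmem = MemTotal
awk '/^MemTotal:/  {p = $2; v += $2}
     /^SwapTotal:/ {v += $2}
     END {printf "TT total vmem=%d bytes, pmem=%d bytes\n", v*1024, p*1024}' /proc/meminfo
{code}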
> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4035
> URL: https://issues.apache.org/jira/browse/HADOOP-4035
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Affects Versions: 0.19.0
> Reporter: Hemanth Yamijala
> Assignee: Vinod K V
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt,
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt,
> HADOOP-4035-20081008.txt, HADOOP-4035-20081121.txt,
> HADOOP-4035-20081126.1.txt, HADOOP-4035-20081128-4.txt,
> HADOOP-4035-20081202.1.txt, HADOOP-4035-20081202.2.txt,
> HADOOP-4035-20081202.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify
> memory requirements for jobs, and also modified the tasktrackers to report
> their free memory. The capacity scheduler in HADOOP-3445 should schedule
> tasks based on these parameters. A task that is scheduled on a TT that uses
> more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on
> the TT by more than the default amount while it runs. The scheduler should
> make the used capacity account for this additional usage while enforcing
> limits, etc.