Fair scheduler fairness question

2010-03-10 Thread Neo Anderson
I am learning how the fair scheduler manages jobs so that each job shares 
resources over time, but I don't know whether my understanding is correct. 

My scenario is that I have 3 data nodes and the cluster is configured using 
the fair scheduler with three pools (e.g. A, B, C). Each pool is 
configured with '<maxRunningJobs>1</maxRunningJobs>'. Now the clients try to 
submit 4 jobs (e.g. via submitJob()) to the 3 different pools. For instance, 

the first job is submitted to pool A
the second job is submitted to pool B
the third job is submitted to pool B
the fourth job is submitted to pool C
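The relevant part of my allocations file looks something like this (a sketch from my understanding of the fair scheduler docs; pool names are placeholders):

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (allocations file): each pool may run at most one job -->
<allocations>
  <pool name="A">
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
  <pool name="B">
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
  <pool name="C">
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
</allocations>
```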

So I expect that the first 3 jobs will occupy the free slots (the slots should 
all be full now). Then the fourth job is submitted. But the slots are full, 
and the fourth job should also get a slot to run in; therefore, the 
third job will be terminated (killed) so that the fourth job can be launched. 

Is my scenario correct? 
And if I am right, is there any keyword I can search for in the logs to observe 
such activity (i.e. the job that is being killed, e.g. the third job)?

Thanks for your help. 
I appreciate any advice.

Re: Fair scheduler fairness question

2010-03-10 Thread Allen Wittenauer
On 3/10/10 7:38 AM, Neo Anderson javadeveloper...@yahoo.co.uk wrote:

 I am learning how the fair scheduler manages jobs so that each job shares
 resources over time, but I don't know whether my understanding is correct.
 
 My scenario is that I have 3 data nodes and the cluster is configured using
 the fair scheduler with three pools (e.g. A, B, C). Each pool is
 configured with '<maxRunningJobs>1</maxRunningJobs>'. Now the clients try to
 submit 4 jobs (e.g. via submitJob()) to the 3 different pools. For instance,
 
 the first job is submitted to pool A
 the second job is submitted to pool B
 the third job is submitted to pool B
 the fourth job is submitted to pool C
 
 So I expect that the first 3 jobs will occupy the free slots (the slots should
 all be full now). Then the fourth job is submitted. But the slots are full,
 and the fourth job should also get a slot to run in; therefore, the
 third job will be terminated (killed) so that the fourth job can be launched.
 
 Is my scenario correct?
 And if I am right, is there any keyword I can search for in the logs to observe
 such activity (i.e. the job that is being killed, e.g. the third job)?


A lot of it depends upon timing.  If there is a long enough pause between
job 1 and job 2, job 1 will take every slot available to it.  As job 1's
slots finish, jobs 2 and 4 would get those slots.  As job 2 finishes, job 3
will get its slots.

Slots are only freed by force if the scheduler you are using has
pre-emption.  I think some versions of fair share may have it.  Entire jobs
are never killed.
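The timing argument above can be sketched with a toy model (this is an illustration, not Hadoop code; the 6-slot total of an assumed 3 nodes x 2 slots and the even-split rule are simplifying assumptions):

```python
# Toy model of fair sharing, NOT the real Hadoop scheduler: slots are split
# evenly among currently runnable jobs; nothing is ever killed.
def fair_share(slots, runnable_jobs):
    """Return each runnable job's fair share of `slots` (toy even split)."""
    share, extra = divmod(slots, len(runnable_jobs))
    return {job: share + (1 if i < extra else 0)
            for i, job in enumerate(runnable_jobs)}

# Job 1 arrives alone and takes every slot:
print(fair_share(6, ["job1"]))                   # {'job1': 6}

# Jobs 2 and 4 arrive. Without preemption nothing is killed; each slot
# job 1 frees is simply handed out until the shares converge to:
print(fair_share(6, ["job1", "job2", "job4"]))   # {'job1': 2, 'job2': 2, 'job4': 2}

# Job 3 is not runnable yet: pool B already runs job 2 and maxRunningJobs is 1,
# so job 3 waits for job 2 to finish rather than anything being preempted.
```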





Re: Fair scheduler fairness question

2010-03-10 Thread Neo Anderson


--- On Wed, 10/3/10, Allen Wittenauer awittena...@linkedin.com wrote:

 From: Allen Wittenauer awittena...@linkedin.com
 Subject: Re: Fair scheduler fairness question
 To: common-user@hadoop.apache.org
 Date: Wednesday, 10 March, 2010, 16:06
 On 3/10/10 7:38 AM, Neo Anderson javadeveloper...@yahoo.co.uk wrote:
 
  I am learning how the fair scheduler manages jobs so that each job shares
  resources over time, but I don't know whether my understanding is correct.
  [...]
 
 A lot of it depends upon timing.  If there is a long enough pause between
 job 1 and job 2, job 1 will take every slot available to it.  As job 1's
 slots finish, jobs 2 and 4 would get those slots.  As job 2 finishes, job 3
 will get its slots.
 
 Slots are only freed by force if the scheduler you are using has
 pre-emption.  I think some versions of fair share may have it.  Entire jobs
 are never killed.
 

At the moment I use Hadoop 0.20.2 and I cannot find any code related to the 
'preempt' function; however, the JIRA issue MAPREDUCE-551 says the preemption 
feature was already fixed in version 0.20.0. Also, I can find some 
functions related to 'preemption', e.g. 'protected void 
preemptTasksIfNecessary()', in the patch. I am confused now: which function in 
version 0.20.2 (or 0.20.1) is used to preempt unnecessary tasks (so that slots 
can be freed for other tasks/jobs to run)?

Thank you for your help. 


Re: Fair scheduler fairness question

2010-03-10 Thread Allen Wittenauer



On 3/10/10 9:14 AM, Neo Anderson javadeveloper...@yahoo.co.uk wrote:
 At the moment I use Hadoop 0.20.2 and I cannot find any code related to the
 'preempt' function; however, the JIRA issue MAPREDUCE-551 says the
 preemption feature was already fixed in version 0.20.0.

MR-551 says fixed in 0.21 at the top.  Reading the text shows that patches
are available if you want to patch your own build of 0.20.
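For what it's worth, the mechanics of patching a source tree look like the sketch below. The file and directory names are stand-ins, not the real MAPREDUCE-551 artifacts; for the real thing you would download the 0.20 patch attached to the JIRA issue, apply it from the root of your hadoop-0.20.2 checkout, and rebuild with ant:

```shell
# Self-contained demo of the patch workflow using a stand-in file and patch.
mkdir -p /tmp/patch-demo
printf 'no preemption\n' > /tmp/patch-demo/Scheduler.java
cat > /tmp/patch-demo/fix.patch <<'EOF'
--- Scheduler.java
+++ Scheduler.java
@@ -1 +1 @@
-no preemption
+preemptTasksIfNecessary()
EOF
# First check it applies cleanly, then actually apply it.
(cd /tmp/patch-demo && patch -p0 --dry-run < fix.patch && patch -p0 < fix.patch)
cat /tmp/patch-demo/Scheduler.java   # prints: preemptTasksIfNecessary()
```

For a real Hadoop build the last step would be rebuilding the jars with ant rather than cat-ing a file.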



Re: Fair scheduler fairness question

2010-03-10 Thread Todd Lipcon
On Wed, Mar 10, 2010 at 9:18 AM, Allen Wittenauer
awittena...@linkedin.com wrote:

 On 3/10/10 9:14 AM, Neo Anderson javadeveloper...@yahoo.co.uk wrote:
  At the moment I use Hadoop 0.20.2 and I cannot find any code related to the
  'preempt' function; however, the JIRA issue MAPREDUCE-551 says the
  preemption feature was already fixed in version 0.20.0.

 MR-551 says fixed in 0.21 at the top.  Reading the text shows that patches
 are available if you want to patch your own build of 0.20.


If you'd rather not patch your own build of Hadoop, the fair scheduler
preemption feature is also available in CDH2:
http://archive.cloudera.com/cdh/2/hadoop-0.20.1+169.56.tar.gz

-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera