On 3/10/10 7:38 AM, "Neo Anderson" <javadeveloper...@yahoo.co.uk> wrote:
> I am learning how the fair scheduler manages jobs to allow each job to
> share resources over time, but I don't know if my understanding is
> correct.
>
> My scenario is that I have 3 data nodes and the cluster is configured
> with the fair scheduler and three pools (e.g. A, B, C). Each pool is
> configured with '<maxRunningJobs>1</maxRunningJobs>.' Now the clients
> try to submit 4 jobs (e.g. submitjob()) to 3 different pools. For
> instance,
>
> the first job is submitted to pool A
> the second job is submitted to pool B
> the third job is submitted to pool B
> the fourth job is submitted to pool C
>
> So I expect that the first 3 jobs will occupy the free slots (the slots
> should be full now). Then the fourth job is submitted. But since the
> slots are full, and the fourth job should also have a slot to execute
> in, the third job will be terminated (or killed) so that the fourth job
> can be launched.
>
> Is my scenario correct?
> And if I am right, is there any keyword searchable in the log to observe
> such activity (i.e. which job is being killed, e.g. the third job)?

A lot of it depends upon timing. If there is a long enough pause between
job 1 and job 2, job 1 will take every slot available to it. As job 1's
slots finish, jobs 2 and 4 would get those slots. As job 2 finishes, job 3
will get its slots.

Slots are only freed by force if the scheduler you are using has
preemption; I think some versions of fair share may have it. Even then,
only individual tasks are preempted. Entire jobs are never killed.
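For what it's worth, a pool setup like the one you describe, with
preemption timeouts so the scheduler can forcibly reclaim task slots,
might look roughly like this in the fair scheduler's allocations file.
This is just a sketch: the pool names come from your example, but the
timeout values are illustrative, and which elements are supported (and
their units) varies by Hadoop version, so check the docs for your release.

```xml
<?xml version="1.0"?>
<!-- Hypothetical allocations file (e.g. conf/fair-scheduler.xml).
     Timeout values below are made-up examples, not recommendations. -->
<allocations>
  <pool name="A">
    <maxRunningJobs>1</maxRunningJobs>
    <!-- How long pool A may sit below its minimum share before the
         scheduler is allowed to preempt tasks from other pools. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </pool>
  <pool name="B">
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
  <pool name="C">
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
  <!-- How long a job may sit below half its fair share before tasks
       are preempted from over-share pools. -->
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```

Note that, in the versions I'm aware of, the timeouts alone do nothing
unless preemption is actually switched on for the JobTracker (a
mapred-site.xml property in the MR1 fair scheduler); again, confirm the
exact property name against your version's fair scheduler documentation.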