Re: cluster under-utilization with Hadoop Fair Scheduler

2010-04-11 Thread Todd Lipcon
Hi Abhishek,

This behavior is improved by MAPREDUCE-706, I believe (I'm not certain
that's the JIRA, but I know it's fixed in the trunk fair scheduler). These
patches are included in CDH3 (currently in beta):
http://archive.cloudera.com/cdh/3/

In general, though, map tasks that are so short are not going to be very
efficient - even with fast assignment there is some constant overhead per
task.

Thanks
-Todd

On Sun, Apr 11, 2010 at 11:42 AM, abhishek sharma absha...@usc.edu wrote:

 Hi all,

 I have been using the Hadoop Fair Scheduler for some experiments on a
 100 node cluster with 2 map slots per node (hence, a total of 200 map
 slots).

 In one of my experiments, all the map tasks finish within a heartbeat
 interval of 3 seconds. I noticed that the maximum number of concurrently
 active map slots on my cluster never exceeds 100, and hence, the
 cluster utilization during my experiments never exceeds 50% even when
 large jobs with more than 1000 maps are being executed.

 A look at the Fair Scheduler code (in particular, the assignTasks
 function) revealed the reason.
 As per my understanding, with the implementation in Hadoop 0.20.0, a
 TaskTracker is not assigned more than 1 map and 1 reduce task per
 heartbeat.

 In my experiments, in every heartbeat, each TT has 2 free map slots
 but is assigned only 1 map task, and hence, the utilization never goes
 beyond 50%.
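
[To spell out the arithmetic behind that 50% ceiling, using only the
numbers above: each of the 100 TaskTrackers receives at most 1 map task
per ~3-second heartbeat, and every map finishes within that interval, so
a TaskTracker never has more than 1 of its 2 map slots busy at a time.
That caps the cluster at 100 of 200 occupied map slots, i.e.
100 / 200 = 50% utilization.]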

 Of course, this (degenerate) case does not arise when map tasks take
 more than one heartbeat interval to finish. For example, I repeated
 the experiments with map tasks taking close to 15 s to finish and
 noticed close to 100% utilization when large jobs were executing.

 Why does the Fair Scheduler not assign more than one map task to a TT
 per heartbeat? Is this done to spread the load uniformly across the
 cluster?
 I looked at the assignTasks function in the default Hadoop scheduler
 (JobQueueTaskScheduler.java), and it does assign more than 1 map task
 per heartbeat to a TT.

 It is easy to change the Fair Scheduler to assign more than 1 map
 task to a TT per heartbeat (I did that and achieved 100% utilization
 even with small map tasks). But I am wondering whether doing so would
 violate some fairness properties.

 Thanks,
 Abhishek
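
[For illustration only: below is a minimal, self-contained Java sketch of
the two assignment policies discussed in this thread. It is not the actual
Hadoop scheduler code; the TrackerStatus and MapTask classes and the
assign* methods are hypothetical simplifications.]

    import java.util.ArrayList;
    import java.util.List;

    public class HeartbeatAssignmentSketch {

        // Hypothetical stand-in for a TaskTracker's heartbeat status.
        static class TrackerStatus {
            final int maxMapSlots;
            final int runningMaps;
            TrackerStatus(int maxMapSlots, int runningMaps) {
                this.maxMapSlots = maxMapSlots;
                this.runningMaps = runningMaps;
            }
            int freeMapSlots() { return maxMapSlots - runningMaps; }
        }

        // Hypothetical stand-in for a pending map task.
        static class MapTask {
            final int id;
            MapTask(int id) { this.id = id; }
            @Override public String toString() { return "map-" + id; }
        }

        private final List<MapTask> pendingMaps = new ArrayList<MapTask>();

        HeartbeatAssignmentSketch(int pending) {
            for (int i = 0; i < pending; i++) {
                pendingMaps.add(new MapTask(i));
            }
        }

        // Policy described for the 0.20 Fair Scheduler: hand out at most
        // one map task per heartbeat, no matter how many slots are free.
        List<MapTask> assignOnePerHeartbeat(TrackerStatus tt) {
            List<MapTask> assigned = new ArrayList<MapTask>();
            if (tt.freeMapSlots() > 0 && !pendingMaps.isEmpty()) {
                assigned.add(pendingMaps.remove(0));
            }
            return assigned;
        }

        // Policy described for JobQueueTaskScheduler (and Abhishek's
        // change): keep assigning until the free map slots are used up.
        List<MapTask> assignUntilSlotsFull(TrackerStatus tt) {
            List<MapTask> assigned = new ArrayList<MapTask>();
            while (assigned.size() < tt.freeMapSlots() && !pendingMaps.isEmpty()) {
                assigned.add(pendingMaps.remove(0));
            }
            return assigned;
        }

        public static void main(String[] args) {
            TrackerStatus tt = new TrackerStatus(2, 0); // 2 map slots, both free

            HeartbeatAssignmentSketch a = new HeartbeatAssignmentSketch(10);
            System.out.println("one per heartbeat:   " + a.assignOnePerHeartbeat(tt));

            HeartbeatAssignmentSketch b = new HeartbeatAssignmentSketch(10);
            System.out.println("fill all free slots: " + b.assignUntilSlotsFull(tt));
        }
    }

[Running main prints a single assigned task for the first policy and two
for the second, which is the difference between 1 of 2 slots (50%) and
2 of 2 slots (100%) being used on each TaskTracker per heartbeat.]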




-- 
Todd Lipcon
Software Engineer, Cloudera


Re: cluster under-utilization with Hadoop Fair Scheduler

2010-04-11 Thread Ted Yu
Reading assignTasks() in 0.20.2 reveals that the number of map tasks
assigned is not limited to 1 per heartbeat.

Cheers
