What does the jobtracker web page say is the total reduce capacity? -Joey
On Mar 10, 2012, at 5:39, WangRamon <ramon_w...@hotmail.com> wrote:

> Hi All
>
> I'm using Hadoop-0.20-append. The cluster contains 3 nodes, and each node
> has 14 map and 14 reduce slots. Here is the configuration:
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>14</value>
> </property>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>14</value>
> </property>
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>73</value>
> </property>
>
> When I submit 5 jobs simultaneously (the input data for each job is small
> for this test, about 2~5 MB in size), I assume the jobs will use the slots
> as much as possible. Each job did create 73 reduce tasks as configured
> above, so there are 5 * 73 reduce tasks in total, but most of them are in
> the pending state and only about 12 of them are running. That is far too
> few compared with the total number of reduce slots, which is 42 for the
> 3-node cluster.
>
> What is interesting is that it is always about 12 of them running; I tried
> a few times.
> So I thought it might be because of the scheduler, and I changed it to the
> Fair Scheduler with 3 pools. The configuration is as below:
>
> <?xml version="1.0"?>
> <allocations>
>   <pool name="pool-a">
>     <minMaps>14</minMaps>
>     <minReduces>14</minReduces>
>     <weight>1.0</weight>
>   </pool>
>   <pool name="pool-b">
>     <minMaps>14</minMaps>
>     <minReduces>14</minReduces>
>     <weight>1.0</weight>
>   </pool>
>   <pool name="pool-c">
>     <minMaps>14</minMaps>
>     <minReduces>14</minReduces>
>     <weight>1.0</weight>
>   </pool>
> </allocations>
>
> Then I submitted the 5 jobs simultaneously to these pools, randomly, again.
> I can see the jobs were assigned to different pools, but it is still the
> same problem: only about 12 of the reduce tasks across the pools are
> running. Here is the output I copied from the Fair Scheduler monitoring
> GUI:
>
> pool-a 2 14 14 0 9
> pool-b 0 14 14 0 0
> pool-c 2 14 14 0 3
>
> pool-a and pool-c have a total of 12 reduce tasks running, but I do have
> at least about 11 more reduce slots available in my cluster.
>
> So can anyone please give me some suggestions: why are NOT all my REDUCE
> SLOTS working? Thanks in advance.
>
> Cheers
> Ramon
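For reference, the slot arithmetic described in the thread can be sanity-checked with a short sketch. The node count, slot settings, job count, and pool weights are taken from the configuration quoted above; the fair-share calculation is a simplified view (weight-proportional division of cluster capacity) and ignores minReduces, demand, and preemption, which the real Fair Scheduler also takes into account:

```python
# Sanity-check the reduce-slot math from the quoted configuration.

NODES = 3                    # cluster size from the post
REDUCE_SLOTS_PER_NODE = 14   # mapred.tasktracker.reduce.tasks.maximum
JOBS = 5                     # jobs submitted simultaneously
REDUCES_PER_JOB = 73         # mapred.reduce.tasks

total_reduce_slots = NODES * REDUCE_SLOTS_PER_NODE
total_reduce_tasks = JOBS * REDUCES_PER_JOB

print(total_reduce_slots)    # cluster-wide reduce capacity: 42
print(total_reduce_tasks)    # reduce tasks competing for it: 365

# With three pools of equal weight 1.0, a weight-proportional split of
# the cluster's reduce capacity gives each pool a third of the slots.
pools = {"pool-a": 1.0, "pool-b": 1.0, "pool-c": 1.0}
total_weight = sum(pools.values())
fair_share = {name: total_reduce_slots * w / total_weight
              for name, w in pools.items()}
print(fair_share["pool-a"])  # 14.0 slots per pool
```

So with 365 reduce tasks pending against 42 slots, the expectation in the post is reasonable: the scheduler should be able to keep all 42 slots busy, which is why seeing only ~12 running tasks suggests a throttling or scheduling limit rather than a lack of demand.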