Yes, I mean the HDFS block size. Since there is a combiner in the picture
in buildClusters, there might not be enough rows left for the reduce tasks
to process. Just a wild guess. You can also try with a larger input data set.
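
To make the block size guess a bit more concrete, here is a rough Python sketch of how block size changes the number of input splits (and so map tasks) for a single file. It assumes the simplest case, one split per HDFS block, and ignores the min/max split size settings. The 1 GB input size and the function name are hypothetical, since the real input size is not given in this thread. The last line only reproduces the 42 * 1.75 reduce-task figure quoted further down.

```python
import math

MB = 1024 * 1024


def estimated_map_tasks(input_bytes: int, block_bytes: int) -> int:
    """Rough split count for one file, assuming one split per HDFS block."""
    return math.ceil(input_bytes / block_bytes)


# Hypothetical 1 GB input; the thread does not state the real input size.
input_bytes = 1024 * MB

for block_mb in (16, 64, 128):
    tasks = estimated_map_tasks(input_bytes, block_mb * MB)
    print(f"block size {block_mb} MB -> about {tasks} map tasks")

# Reduce-task setting quoted in the thread: 42 slots * 1.75, truncated.
reduce_tasks = int(42 * 1.75)
print(f"reduce tasks per job: {reduce_tasks}")
```

If the input is small relative to the block size, only a handful of map tasks run, and with a combiner in between there may be very little data left for each of the 73 reducers.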

On 11-03-2012 16:49, WangRamon wrote:
> Hi Paritosh, I think the block size may be the problem too. Btw, do you mean 
> the block size of HDFS? I know its default size is 64MB, but I haven't 
> tried any other size. Thanks, Ramon
>> Date: Sun, 11 Mar 2012 13:18:52 +0530
>> From: pran...@xebia.com
>> To: user@mahout.apache.org
>> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means 
>> cluster
>>
>> Can you try reducing/increasing your block size and see the impact?
>> I suspect the block size is the problem.
>>
>> I have faced the same problem once (for a different Hadoop job, and it
>> was very hard to debug). In that case, CompositeInputFormat was
>> being used as input, which fixed the block size at 64 MB, and
>> hence only a few reducers were activated. So, trying different block
>> sizes might give some clue.
>>
>> On 11-03-2012 11:04, WangRamon wrote:
>>> Here is the configuration:
>>>     <property>
>>>         <name>mapred.tasktracker.map.tasks.maximum</name>
>>>         <value>14</value>
>>>     </property>
>>>     <property>
>>>         <name>mapred.tasktracker.reduce.tasks.maximum</name>
>>>         <value>14</value>
>>>     </property>
>>>     <property>
>>>         <name>mapred.reduce.tasks</name>
>>>         <value>73</value>
>>>     </property>
>>>  
>>> Each node has 32GB of RAM, so I think the above configuration should 
>>> be fine.
>>>> Date: Sat, 10 Mar 2012 22:31:44 -0700
>>>> From: j...@windwardsolutions.com
>>>> To: user@mahout.apache.org
>>>> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means 
>>>> cluster
>>>>
>>>> What's your Hadoop config in terms of the maximum number of reducers?
>>>> It's a function of the available RAM on each node and the number of nodes.
>>>>
>>>> On 3/10/12 8:55 PM, WangRamon wrote:
>>>>> Hi Paritosh, I did the tests with 1 job and with 5 jobs, and they both 
>>>>> show the same problem. The job I'm running is the buildClusters one. I 
>>>>> can see in the monitor GUI that 73 reduce tasks are created, but only 12 
>>>>> of them are running at any time (the rest are in the pending state). The 
>>>>> tasks finish very quickly, no more than 18 seconds per reduce task, so 
>>>>> maybe that's the cause? Thanks. Cheers, Ramon
>>>>>> Date: Sun, 11 Mar 2012 09:14:15 +0530
>>>>>> From: pran...@xebia.com
>>>>>> To: user@mahout.apache.org
>>>>>> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means 
>>>>>> cluster
>>>>>>
>>>>>> And to answer the question about the KMeans configuration:
>>>>>>
>>>>>> KMeans has two jobs:
>>>>>> 1) buildClusters: has a reducer and has no limitation on the number of
>>>>>> reducer tasks
>>>>>> 2) clusterData: executes if runClustering = true, has no reducer tasks
>>>>>>
>>>>>> On 11-03-2012 09:10, Paritosh Ranjan wrote:
>>>>>>> Can you run the K-means jobs again (all with the same block size) and
>>>>>>> give the same statistics for:
>>>>>>>
>>>>>>> a) only 1 job running
>>>>>>> b) 2 jobs running simultaneously
>>>>>>> c) 5 jobs running simultaneously
>>>>>>>
>>>>>>> On 10-03-2012 21:08, WangRamon wrote:
>>>>>>>> Hi All, I submitted 5 K-Means jobs simultaneously. My Hadoop cluster 
>>>>>>>> has 42 map and 42 reduce slots configured, and I set the default number 
>>>>>>>> of reduce tasks per job to 73 (42 * 1.75). I find that only about 12 
>>>>>>>> reduce tasks are running at any time, although 73 reduce tasks are 
>>>>>>>> created for each K-Means job and I do have 42 reduce slots. That means 
>>>>>>>> about 30 reduce slots are free at any time. So I tried RecommenderJob 
>>>>>>>> from Mahout again, as I remembered that job using all my slots in a 
>>>>>>>> previous test, and yes, this time 
>>>>>>>> "RowSimilarityJob-CooccurrencesMapper-Reducer" does use all the slots, 
>>>>>>>> 42 reduce and 42 map. So I'm wondering: is there something configured 
>>>>>>>> in Mahout that causes this strange behavior? Any suggestions? Thanks in 
>>>>>>>> advance. Btw, I'm using the mahout-0.6 release. Cheers, Ramon
