All reducers are not being utilized

2012-08-02 Thread Saurabh Bajaj
Hi everyone, I was running an MR job in Java and this scenario happened:
Case 1:
Number of distinct output keys from mapper = 3
Expected # of reducers = 3
Defined set # of reducers to be called = 2
Expected outcome:
# of reducers spawned = 2
# of keys processed under first reducer = 1
#

Re: All reducers are not being utilized

2012-08-02 Thread Harsh J
Saurabh, I do not see you talk about defining a custom Partitioner that can guarantee such perfect key distribution. The default partitioner is the HashPartitioner that can only guarantee randomized distribution (as it is key data specific). Hence, your test here with just 3 keys is not really a
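To make the point about the default partitioner concrete, here is a minimal sketch of the arithmetic HashPartitioner applies to each key: the key's hash code with the sign bit cleared, modulo the number of reduce tasks. The class name, helper methods, and the keys "aa", "bb", "cc" are hypothetical and chosen only for illustration. String.hashCode() stands in for the hash of a real Hadoop key type such as Text, which hashes differently, so the exact assignment in a real job will not match this one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it does not depend on the Hadoop libraries.
class HashPartitionSketch {

    // Same arithmetic as HashPartitioner.getPartition:
    // clear the sign bit of the hash, then take the remainder.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Count how many of the given keys land on each reducer index.
    static int[] keysPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed example keys; String hashes here are 3104, 3136 and 3168.
        List<String> keys = Arrays.asList("aa", "bb", "cc");

        // All three hashes are even, so with 2 reducers every key goes to reducer 0.
        System.out.println("2 reducers: " + Arrays.toString(keysPerReducer(keys, 2))); // [3, 0]

        // With 3 reducers the same keys happen to spread one per reducer.
        System.out.println("3 reducers: " + Arrays.toString(keysPerReducer(keys, 3))); // [1, 1, 1]
    }
}
```

With only a handful of distinct keys, an idle reducer is an ordinary outcome of hashing rather than a scheduling fault. If a fixed key-to-reducer mapping is required, a custom Partitioner set on the job is the usual route.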

Re: All reducers are not being utilized

2012-08-02 Thread Harsh J
?
Saurabh
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, August 02, 2012 4:05 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: All reducers are not being utilized
Saurabh, I do not see you talk about defining a custom Partitioner that can

Re: All reducers are not being utilized

2012-08-02 Thread Steve Sonnenberg
]
Sent: Thursday, August 02, 2012 4:05 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: All reducers are not being utilized
Saurabh, I do not see you talk about defining a custom Partitioner that can guarantee such perfect key distribution. The default

Re: All reducers are not being utilized

2012-08-02 Thread Bejoy Ks
Hi Saurabh/Steve, From my understanding, the schedulers in Hadoop consider only data locality (for map tasks) and the availability of slots when scheduling tasks on the various nodes. Say you have 3 TT nodes with 2 reducer slots each (assume all slots are free). If we execute a map reduce job with 3

Re: All reducers are not being utilized

2012-08-02 Thread Steve Sonnenberg
If I have 2 nodes, and 150 input files in a single 'input' directory to search using the 'grep' example, isn't it reasonable that both nodes would be involved? Thanks
On Thu, Aug 2, 2012 at 3:31 PM, Bejoy Ks bejoy.had...@gmail.com wrote:
Hi Saurab/Steve From my understanding the schedulers