Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread Jason Yang
Hi all, I have a question: does all the intermediate output with the same key go to the same reducer? If so, what happens when the mapper generates only two keys but there are 3 reducers running in the job? If not, how could I do some processing over all the data, like counting?

Re: Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread Hemanth Yamijala
Hi, Yes. By contract, all intermediate output with the same key goes to the same reducer. In your example, suppose that, of the two keys generated by the mapper, one key goes to reducer 1 and the second goes to reducer 2. Reducer 3 will then have no records to process and will end without producing any
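
To make the contract concrete, here is a minimal sketch of hash partitioning in plain Java with no Hadoop dependency. It mirrors the formula used by Hadoop's default HashPartitioner (hash code masked to non-negative, modulo the number of reduce tasks). The class name PartitionSketch and the keys "a" and "b" are made up for illustration, and reducers are numbered from 0 here.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner:
    // mask off the sign bit, then take the remainder by the reducer count.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3;
        // One bucket of keys per reducer (hypothetical in-memory stand-in for the shuffle).
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            buckets.add(new ArrayList<>());
        }
        // Only two distinct keys are emitted, each several times.
        String[] emitted = {"a", "b", "a", "a", "b"};
        for (String key : emitted) {
            buckets.get(partitionFor(key, numReduceTasks)).add(key);
        }
        for (int i = 0; i < numReduceTasks; i++) {
            System.out.println("reducer " + i + " receives " + buckets.get(i));
        }
    }
}
```

Because the partition depends only on the key and the reducer count, every record with key "a" lands in the same bucket no matter which mapper emitted it. With two distinct keys and three reducers, at least one bucket stays empty, which is the idle-reducer case described above.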

Re: Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread feng lu
Hi, "If not, how could I do some processing over the all data, like counting?" Maybe you can refer to the TeraSort example in Hadoop. It uses a partitioner that splits text keys into roughly equal partitions in a globally sorted order. On Thu, Sep 20, 2012 at 9:28 PM, Hemanth Yamijala
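
The range-partitioning idea mentioned above can be sketched roughly as follows. This is not the TeraSort source, only a simplified illustration in plain Java: the class name RangePartitionSketch and the split points "h" and "p" are hypothetical, and a real job would pick split points by sampling the input so the partitions come out roughly equal.

```java
import java.util.Arrays;

class RangePartitionSketch {
    // splitPoints must be sorted ascending; n split points define n + 1 partitions.
    // Keys below splitPoints[0] go to partition 0, keys in
    // [splitPoints[i-1], splitPoints[i]) go to partition i, and so on.
    static int partitionFor(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        String[] splitPoints = {"h", "p"}; // hypothetical: 3 reducers
        for (String key : new String[] {"apple", "kiwi", "zebra", "h"}) {
            System.out.println(key + " -> reducer " + partitionFor(key, splitPoints));
        }
    }
}
```

Since every key in partition i sorts before every key in partition i + 1, concatenating the reducers' sorted outputs in partition order gives a globally sorted result. The same-key-same-reducer guarantee still holds, because the partition is a function of the key alone.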