Hi all,
I have a question: does all intermediate output with the same key go to
the same reducer?
If so, what happens when the mappers generate only two keys but the job
runs 3 reducers?
If not, how could I do some processing over all the data, like counting?
Hi,
Yes. By contract, all intermediate output with the same key goes to
the same reducer.
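
To make the contract concrete, here is a minimal sketch of how a hash-based
partitioner assigns keys to reducers. The arithmetic mirrors Hadoop's default
HashPartitioner, but the class name PartitionSketch and the sample keys are
made up for illustration, and the code runs without Hadoop on the classpath.

```java
import java.util.Map;
import java.util.TreeMap;

class PartitionSketch {
    // Clear the sign bit of the hash, then take the remainder, so the result
    // is always in [0, numReduceTasks). Equal keys always give the same index.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Hypothetical mapper output: only two distinct keys.
        String[] keys = {"apple", "banana", "apple", "banana", "apple"};
        int numReduceTasks = 3;

        // Count how many records each reducer index would receive.
        Map<Integer, Integer> recordsPerReducer = new TreeMap<>();
        for (int r = 0; r < numReduceTasks; r++) {
            recordsPerReducer.put(r, 0);
        }
        for (String key : keys) {
            recordsPerReducer.merge(partitionFor(key, numReduceTasks), 1, Integer::sum);
        }
        System.out.println(recordsPerReducer);
    }
}
```

With two distinct keys and three reducers, at least one reducer index ends up
with a count of zero. Both keys can also hash to the same index, in which case
two reducers get nothing.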
In your example, suppose that of the two keys generated by the mapper,
one goes to reducer 1 and the other goes to reducer 2. Reducer 3 will
then have no records to process and will end without producing any
Hi
If not, how could I do some processing over all the data, like counting?
Maybe you can refer to the TeraSort example in Hadoop. It uses a partitioner
that splits text keys into roughly equal partitions in a globally sorted
order.
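
A rough sketch of the range-partitioning idea, not TeraSort's actual code:
given sorted split points, each key goes to the reducer whose range contains
it, so reducer outputs concatenated in order are globally sorted. The class
name RangePartitionSketch and the split points "h" and "p" are hypothetical;
in a real job the split points would be derived from a sample of the input.

```java
import java.util.Arrays;

class RangePartitionSketch {
    // splitPoints must be sorted. With n split points there are n + 1 ranges:
    // reducer 0 takes keys below splitPoints[0], reducer i takes keys k with
    // splitPoints[i-1] <= k < splitPoints[i], and the last reducer takes the rest.
    static int partitionFor(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        String[] splitPoints = {"h", "p"}; // hypothetical boundaries for 3 reducers
        for (String key : new String[] {"apple", "hadoop", "mapper", "pig", "zebra"}) {
            System.out.println(key + " -> reducer " + partitionFor(key, splitPoints));
        }
    }
}
```

Unlike hash partitioning, how evenly the load is spread here depends on how
well the split points match the real key distribution.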
On Thu, Sep 20, 2012 at 9:28 PM, Hemanth Yamijala