Hi Harsh,

Thanks for clarifying. I had earlier thought that the Partitioner picks the reducer.
My cluster setup provides options for multiple reducers, so I want to know when and in which scenario we have to go for multiple reducers?

Cheers!
Manoj.

On Mon, Jul 9, 2012 at 11:27 PM, Harsh J <ha...@cloudera.com> wrote:
> Manoj,
>
> Think of it this way, and you shouldn't be confused: A reducer == a
> partition.
>
> For (1) - Partitioners do not 'call' a reduce, just write the data
> with a proper partition ID. The reducer that's the same as the partition
> ID picks it up for itself later. This we have already explained
> earlier.
>
> For (2) - For what scenario do you _not_ want multiple reducers
> handling each partition uniquely, when it is possible to scale that
> way?
>
> On Mon, Jul 9, 2012 at 11:22 PM, Manoj Babu <manoj...@gmail.com> wrote:
> > Hi,
> >
> > It would be more helpful if you could give more details on the doubts below.
> >
> > 1. How does the partitioner know which reducer needs to be called?
> > 2. When we are using more than one reducer, the output gets separated.
> > Actually, for what scenario do we have to go for multiple reducers?
> >
> > Cheers!
> > Manoj.
> >
> > On Mon, Jul 9, 2012 at 6:54 PM, Arun C Murthy <a...@hortonworks.com> wrote:
> >>
> >> Robert,
> >>
> >> On Jul 7, 2012, at 6:37 PM, Grandl Robert wrote:
> >>
> >> Hi,
> >>
> >> I have some questions related to basic functionality in Hadoop.
> >>
> >> 1. When a Mapper processes the intermediate output data, how does it know how
> >> many partitions to make (how many reducers there will be) and how much data goes in
> >> each partition for each reducer?
> >>
> >> 2. When the JobTracker assigns a task to a reducer, it will also specify the
> >> locations of the intermediate output data that it should retrieve, right?
> >> But how will a reducer know, for each remote location with intermediate
> >> output, which portion it has to retrieve?
> >>
> >>
> >> To add to Harsh's comment. Essentially the TT *knows* where the output of
> >> a given map-id/reduce-id pair is present via an output-file/index-file
> >> combination.
> >>
> >> Arun
> >>
> >> --
> >> Arun C. Murthy
> >> Hortonworks Inc.
> >> http://hortonworks.com/
> >>
> >
> --
> Harsh J
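
To make the "a reducer == a partition" point from the thread concrete, here is a minimal Python sketch of how a map task could tag each record with a partition ID. It is only an illustration under assumptions, not Hadoop's code: the names get_partition and partition_map_output are hypothetical, and zlib.crc32 stands in for the key's hash. The non-negative-hash-modulo-reducer-count step mirrors what Hadoop's default hash partitioner is generally understood to do.

```python
import zlib
from collections import defaultdict


def get_partition(key: str, num_reduce_tasks: int) -> int:
    """Return the partition ID (== reducer index) for a key.

    Assumption: zlib.crc32 is a stand-in for the key's hash function.
    The mask keeps the value non-negative before taking the modulo.
    """
    h = zlib.crc32(key.encode("utf-8"))
    return (h & 0x7FFFFFFF) % num_reduce_tasks


def partition_map_output(records, num_reduce_tasks):
    """Group (key, value) pairs by partition ID, as a map task would
    before reducer i later fetches only partition i."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[get_partition(key, num_reduce_tasks)].append((key, value))
    return dict(partitions)


if __name__ == "__main__":
    # Hypothetical sample data: word-count style map output.
    sample = [("apple", 1), ("banana", 1), ("apple", 1), ("cherry", 1)]
    for pid, recs in sorted(partition_map_output(sample, 3).items()):
        print(pid, recs)
```

The partitioner never calls a reducer. It only decides which partition a record is written to, and every record with the same key lands in the same partition, so a single reducer sees all values for that key. With more reducers, the keys are spread over more partitions and the reduce work runs in parallel, which is also why the output is split into one part per reducer.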