Got it, thanks for the clarification.
On Wed, Nov 21, 2012 at 3:03 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
> Hi Jamal
>
> It is performed at the framework level. The map emits key-value pairs, and the
> framework collects and groups all the values corresponding to a key from
> all the map tasks. The reducer then takes as input a key and a
> collection of values. The reduce method signature defines this.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> *From:* jamal sasha <jamalsha...@gmail.com>
> *Date:* Wed, 21 Nov 2012 14:50:51 -0500
> *To:* user@hadoop.apache.org
> *ReplyTo:* user@hadoop.apache.org
> *Subject:* fundamental doubt
>
> Hi,
> I guess I am asking a lot of fundamental questions, but I thank you all for
> taking the time to clear up my doubts.
> I am able to write map-reduce jobs, but here is my doubt.
> Right now I write mappers which emit a key and a value.
> These key-value pairs are then captured at the reducer end, and I process the
> key and value there.
> Let's say I want to calculate the average:
>
> Key1 value1
> Key2 value2
> Key1 value3
>
> So the output is something like:
>
> Key1 -> average of value1 and value3
> Key2 -> average of value2 (= value2)
>
> Right now in the reducer I have to build a dictionary whose keys are the
> original keys and whose values are lists:
> data = defaultdict(list)  # Python user
> But I thought that
> the mapper takes in key-value pairs and outputs key: (v1, v2, ...), and
> the reducer takes in this key and list of values and returns
> key, new value.
>
> So why is the input of the reducer the plain output of the mapper, and not
> the list of all the values for a particular key? Or did I misunderstand
> something? Am I making any sense?
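For anyone following along, here is a minimal Python sketch of the grouping Bejoy describes. This is not Hadoop itself, just a simulation of what the framework does between map and reduce: the shuffle/sort phase delivers each key's values contiguously, so the reducer can consume them group by group (here via `itertools.groupby`) instead of building a `defaultdict` over all keys. The `reduce_averages` function and the sample data are made up for illustration.

```python
from itertools import groupby

def reduce_averages(sorted_pairs):
    """Average values per key.

    sorted_pairs: (key, value) tuples already sorted by key, which is
    how the shuffle/sort phase hands map output to the reducer.
    """
    for key, group in groupby(sorted_pairs, key=lambda kv: kv[0]):
        # groupby yields each key's values contiguously, so no global
        # dictionary of all keys is needed.
        values = [v for _, v in group]
        yield key, sum(values) / len(values)

# Simulated map output; sorting by key stands in for the framework's
# shuffle/sort step.
map_output = sorted([("Key1", 1.0), ("Key2", 2.0), ("Key1", 3.0)])
print(dict(reduce_averages(map_output)))
```

Note this mirrors the Java API's `reduce(key, Iterable<values>)` signature; with Hadoop Streaming the reducer instead reads sorted `key\tvalue` lines from stdin, so a streaming reducer would apply the same group-by-consecutive-key idea to that line stream.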