Correction*

Try this

Map: emit key=product_id value=new MyMap<product_id,timestamp>()

In the reducer output you can place

for a given product_id
MyMap.size() -> no. of times a product was searched.
MyMap.max() -> date with the highest count

*MyMap* state should be preserved across multiple calls of the map method, i.e. for each input key-value pair.

On Wed, Jul 4, 2012 at 12:21 PM, Sambit Tripathy <sambi...@gmail.com> wrote:

> Try this
>
> Map: emit key=product_id value=new MyMap<product_id,timestamp>()
>
> In the reducer output you can place
>
> for a given product_id
> HashMap.size() -> no. of times a product was searched.
> HashMap.max() -> date with the highest count
>
> On Wed, Jul 4, 2012 at 4:48 AM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
>
>> Well, then you can simply do it like this:
>> Map: emit key=product_id value=date
>> Reduce for a particular product_id: manually count (in a hashtable)
>> dates and their counts, return the date with the highest count
>>
>> Assuming you've started selling products later than computers were
>> invented, this should be fine w.r.t. performance and memory
>> consumption :)
>>
>> On Tue, Jul 3, 2012 at 3:52 PM, Shailesh Samudrala
>> <shailesh2...@gmail.com> wrote:
>> > Yes, I think that is possible, but I'm looking for a 1 MapReduce job
>> > solution, if possible.
>> >
>> > On Tue, Jul 3, 2012 at 3:46 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
>> >
>> >> Ok, I see, so you need to 1) group and count everything group by date
>> >> and product_id => {date, product_id, count} (this is 1 map+reduce) 2)
>> >> group this by product_id and get the value of date for which cnt is
>> >> highest (this is another 1 map+reduce).
>> >> Does this sound sensible?
>> >>
>> >> I'm not sure if this can be efficiently done with just 1 stage of
>> >> map+reduce.
>> >>
>> >> On Tue, Jul 3, 2012 at 3:36 PM, Shailesh Samudrala
>> >> <shailesh2...@gmail.com> wrote:
>> >> > I want to find out how many times a product was searched during a
>> >> > day, and then select the day when this is highest.
>> >> >
>> >> > Until now, I have extracted all the required fields from the search
>> >> > string, and I am confused about what exactly I should be passing from
>> >> > the mapper to the reducer.
>> >> >
>> >> > On Tue, Jul 3, 2012 at 3:30 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
>> >> >
>> >> >> So you want to compute select max(date) from log group by product?
>> >> >> Can you describe how far you have advanced so far and where
>> >> >> precisely you are stuck?
>> >> >>
>> >> >> On Tue, Jul 3, 2012 at 3:23 PM, Shailesh Samudrala
>> >> >> <shailesh2...@gmail.com> wrote:
>> >> >> > I am writing a sample application to analyze some log files of
>> >> >> > webpage accesses. Basically, the log files record which products
>> >> >> > were accessed, and on what date.
>> >> >> > I want to write a MapReduce program to determine on what date a
>> >> >> > product was most accessed.
>> >> >> > Please share your ideas with me. Thanks!
>> >> >>
>> >> >> --
>> >> >> Eugene Kirpichov
>> >> >> http://www.linkedin.com/in/eugenekirpichov
>> >>
>> >> --
>> >> Eugene Kirpichov
>> >> http://www.linkedin.com/in/eugenekirpichov
>>
>> --
>> Eugene Kirpichov
>> http://www.linkedin.com/in/eugenekirpichov
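
For reference, here is a minimal sketch of the single-job approach discussed in this thread: the map step emits key=product_id, value=date, and the reduce step counts dates in a hash table and returns the date with the highest count. It is plain Java that only simulates the map, shuffle and reduce steps in memory; it does not use any Hadoop classes. The class name MostAccessedDate, the tab-separated "product_id<TAB>date" record layout, and the tie-breaking rule (earliest date string wins) are assumptions made for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory simulation of one map+reduce pass; no Hadoop API is involved.
final class MostAccessedDate {

    // "Map" step: turn one log record into a (product_id, date) pair.
    // Assumes a hypothetical record layout "product_id<TAB>date".
    static String[] map(String logLine) {
        String[] fields = logLine.split("\t");
        return new String[] {fields[0], fields[1]};
    }

    // "Reduce" step for one product_id: count each date in a hash table
    // and return the date with the highest count (null if there are no dates).
    // Assumption: ties go to the lexicographically smallest date string.
    static String reduce(Iterable<String> dates) {
        Map<String, Integer> counts = new HashMap<>();
        for (String date : dates) {
            counts.merge(date, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int c = e.getValue();
            if (c > bestCount || (c == bestCount && e.getKey().compareTo(best) < 0)) {
                best = e.getKey();
                bestCount = c;
            }
        }
        return best;
    }

    // Stand-in for the shuffle: group mapped values by key, then reduce each group.
    static Map<String, String> run(List<String> logLines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : logLines) {
            String[] kv = map(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "p1\t2012-07-01", "p1\t2012-07-02", "p1\t2012-07-02",
            "p2\t2012-07-03");
        System.out.println(run(log)); // prints {p1=2012-07-02, p2=2012-07-03}
    }
}
```

The hash table in reduce holds one entry per distinct date for a single product, which is why memory use stays small, as noted earlier in the thread.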