Try this:

Map: emit key=product_id, value=new MyMap<product_id,timestamp>()
Reduce: for a given product_id, you can output:
  HashMap.size() -> number of times the product was searched
  HashMap.max()  -> date with the highest count

On Wed, Jul 4, 2012 at 4:48 AM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
> Well, then you can simply do it like this:
> Map: emit key=product_id value=date
> Reduce for a particular product_id: manually count (in a hashtable)
> dates and their counts, return the date with the highest count
>
> Assuming you've started selling products later than computers were
> invented, this should be fine w.r.t. performance and memory
> consumption :)
>
> On Tue, Jul 3, 2012 at 3:52 PM, Shailesh Samudrala
> <shailesh2...@gmail.com> wrote:
> > Yes, I think that is possible, but I'm looking for a 1 MapReduce job
> > solution, if possible.
> >
> > On Tue, Jul 3, 2012 at 3:46 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
> >
> >> Ok, I see, so you need to
> >> 1) group and count everything group by date and product_id =>
> >>    {date, product_id, count} (this is 1 map+reduce)
> >> 2) group this by product_id and get the value of date for which cnt
> >>    is highest (this is another 1 map+reduce).
> >> Does this sound sensible?
> >>
> >> I'm not sure if this can be efficiently done with just 1 stage of
> >> map+reduce.
> >>
> >> On Tue, Jul 3, 2012 at 3:36 PM, Shailesh Samudrala
> >> <shailesh2...@gmail.com> wrote:
> >> > I want to find out how many times a product was searched during a day,
> >> > and then select the day when this is highest.
> >> >
> >> > Until now, I have extracted all the required fields from the search
> >> > string, and I am confused about what exactly I should be passing from
> >> > the mapper to the reducer.
> >> >
> >> > On Tue, Jul 3, 2012 at 3:30 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
> >> >
> >> >> So you want to compute select max(date) from log group by product?
> >> >> Can you describe how far you have advanced so far and where precisely
> >> >> are you stuck?
> >> >>
> >> >> On Tue, Jul 3, 2012 at 3:23 PM, Shailesh Samudrala
> >> >> <shailesh2...@gmail.com> wrote:
> >> >> > I am writing a sample application to analyze some log files of
> >> >> > webpage accesses. Basically, the log files record which products
> >> >> > were accessed, and on what date.
> >> >> > I want to write a MapReduce program to determine on what date a
> >> >> > product was most accessed.
> >> >> > Please share your ideas with me. Thanks!
> >> >>
> >> >> --
> >> >> Eugene Kirpichov
> >> >> http://www.linkedin.com/in/eugenekirpichov
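
For reference, the single-job approach discussed in this thread (map emits product_id -> date; reduce counts dates in a hashtable and returns the most frequent one) could be sketched roughly as below. This is plain Python that simulates the map, shuffle and reduce steps in memory; it is not Hadoop API code. The record layout (product_id, date) and the names map_record, reduce_product and run_job are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def map_record(record):
    """Map step: emit (product_id, date) for one parsed log record."""
    product_id, date = record  # assumed layout: (product_id, date)
    yield product_id, date


def reduce_product(product_id, dates):
    """Reduce step: count dates for one product and pick the busiest one."""
    counts = Counter(dates)  # the "hashtable" of date -> count
    # Highest count wins; ties go to the earliest date so the result is stable.
    best_date, best_count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())  # total number of accesses for this product
    return product_id, best_date, best_count, total


def run_job(records):
    """Simulate map -> shuffle (group by key) -> reduce in memory."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return [reduce_product(k, v) for k, v in sorted(grouped.items())]


if __name__ == "__main__":
    sample = [
        ("p1", "2012-07-01"),
        ("p1", "2012-07-02"),
        ("p1", "2012-07-02"),
        ("p2", "2012-07-03"),
    ]
    for row in run_job(sample):
        print(row)
```

Note that java.util.HashMap has no max() method, so in a real reducer the date with the highest count has to be found by iterating over the map entries, as the min(...) call does here. Memory per reducer call stays small because it only holds one entry per distinct date for a single product.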