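Try this:

Map: emit key=product_id, value=date (the timestamp truncated to the day)

In the reducer, for a given product_id, build a HashMap of date -> count
from the incoming values. The output can then contain:

                    sum of the counts -> number of times the product was searched
                    entry with the largest count -> date with the highest count

Note that java.util.HashMap has no max() method, and size() only gives the
number of distinct dates, so both values have to be computed by iterating
over the entries.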
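
As a rough sketch of the single-job idea (key=product_id, then a per-date
count table inside the reducer), here is a plain-Python simulation. It is
not Hadoop code: map_record, reduce_product, run_job and SAMPLE_LOG are
hypothetical names made up for illustration, and the sketch assumes each log
record has already been parsed into a (product_id, date) pair.

```python
from collections import Counter, defaultdict


def map_record(record):
    """Map step: emit (key=product_id, value=date) for one parsed log record."""
    product_id, date = record
    return product_id, date


def reduce_product(product_id, dates):
    """Reduce step for one product_id: count searches per date, pick the peak."""
    per_date = Counter(dates)          # date -> number of searches on that date
    total = sum(per_date.values())     # total searches for this product
    # Highest count wins; ties go to the earliest date so output is deterministic.
    best_date = min(per_date, key=lambda d: (-per_date[d], d))
    return product_id, total, best_date, per_date[best_date]


def run_job(records):
    """Simulate the shuffle: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        key, value = map_record(record)
        groups[key].append(value)
    return [reduce_product(k, v) for k, v in sorted(groups.items())]


# Hypothetical sample log: (product_id, date) pairs.
SAMPLE_LOG = [
    ("p1", "2012-07-01"),
    ("p1", "2012-07-02"),
    ("p1", "2012-07-02"),
    ("p2", "2012-07-03"),
]

if __name__ == "__main__":
    for row in run_job(SAMPLE_LOG):
        print(row)
```

The per-product table only holds one entry per distinct date, which is why
memory should not be a concern in the reducer.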


On Wed, Jul 4, 2012 at 4:48 AM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:

> Well, then you can simply do it like this:
> Map: emit key=product_id value=date
> Reduce for a particular product_id: manually count (in a hashtable)
> dates and their counts, return the date with the highest count
>
> Assuming you've started selling products later than computers were
> invented, this should be fine w.r.t. performance and memory
> consumption :)
>
> On Tue, Jul 3, 2012 at 3:52 PM, Shailesh Samudrala
> <shailesh2...@gmail.com> wrote:
> > Yes, I think that is possible, but I'm looking for a 1 MapReduce job
> > solution, if possible.
> >
> > On Tue, Jul 3, 2012 at 3:46 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
> >
> >> Ok, I see, so you need to 1) group everything by date and product_id
> >> and count => {date, product_id, count} (this is 1 map+reduce) 2)
> >> group this by product_id and get the value of date for which count is
> >> highest (this is another 1 map+reduce).
> >> Does this sound sensible?
> >>
> >> I'm not sure if this can be efficiently done with just 1 stage of
> >> map+reduce.
> >>
> >> On Tue, Jul 3, 2012 at 3:36 PM, Shailesh Samudrala
> >> <shailesh2...@gmail.com> wrote:
> >> > I want to find out how many times a product was searched during a day,
> >> > and then select the day when this is highest.
> >> >
> >> > Until now, I have extracted all the required fields from the search
> >> > string, and I am confused about what exactly I should be passing from
> >> > the mapper to the reducer.
> >> >
> >> > On Tue, Jul 3, 2012 at 3:30 PM, Eugene Kirpichov <ekirpic...@gmail.com> wrote:
> >> >
> >> >> So you want to compute select max(date) from log group by product?
> >> >> Can you describe how far you have got and where precisely you are
> >> >> stuck?
> >> >>
> >> >> On Tue, Jul 3, 2012 at 3:23 PM, Shailesh Samudrala
> >> >> <shailesh2...@gmail.com> wrote:
> >> >> > I am writing a sample application to analyze some log files of
> >> >> > webpage accesses. Basically, the log files record which products
> >> >> > were accessed, and on what date.
> >> >> > I want to write a MapReduce program to determine on what date a
> >> >> > product was most accessed.
> >> >> > Please share your ideas with me. Thanks!
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Eugene Kirpichov
> >> >> http://www.linkedin.com/in/eugenekirpichov
> >> >>
> >>
> >>
> >>
> >> --
> >> Eugene Kirpichov
> >> http://www.linkedin.com/in/eugenekirpichov
> >>
>
>
>
> --
> Eugene Kirpichov
> http://www.linkedin.com/in/eugenekirpichov
>
