Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Kevin Tse
Hi,
For each key, there might be millions of values (LongWritable), but I only
want to emit the top 20 of these values, sorted in descending order.
So is it possible to sort these values before they enter the reduce phase?

Thank you in advance!
Kevin


Re: Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Alex Kozlov
Hi Kevin, this is a very common technique.  Look for secondary sort in Tom
White's Hadoop: The Definitive Guide (Chapter 6).  You'll most likely have
to write your own Partitioner and WritableComparator.  -- Alex K
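
[Editor's sketch] The secondary-sort idea above can be illustrated without a
cluster. The following is a minimal plain-Java simulation, not Hadoop code
and not taken from the book: the class SecondarySortSketch, the CompositeKey
type, and the methods partition, compare, and topPerKey are all hypothetical
names invented for this illustration. It assumes the mapper moves the value
into a composite key, that partitioning looks only at the natural key, and
that the sort order is natural key ascending, then value descending. In a
real job these roles would be filled by the custom Partitioner and
WritableComparator mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of secondary sort; all names here are hypothetical.
class SecondarySortSketch {

    // Composite key: the natural key plus the value we want the framework to sort.
    static final class CompositeKey {
        final String naturalKey;
        final long value;

        CompositeKey(String naturalKey, long value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Partitioner role: use only the natural key, so every record for a
    // given key lands in the same partition (i.e. the same reducer).
    static int partition(CompositeKey key, int numPartitions) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Sort-comparator role: natural key ascending, then value descending.
    static int compare(CompositeKey a, CompositeKey b) {
        int byKey = a.naturalKey.compareTo(b.naturalKey);
        return byKey != 0 ? byKey : Long.compare(b.value, a.value);
    }

    // Simulates shuffle, sort, and a reducer that keeps the first `limit`
    // values it sees for each natural key.
    static Map<String, List<Long>> topPerKey(List<CompositeKey> mapOutput,
                                             int numPartitions, int limit) {
        List<List<CompositeKey>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (CompositeKey key : mapOutput) {
            partitions.get(partition(key, numPartitions)).add(key);
        }

        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (List<CompositeKey> part : partitions) {
            part.sort(SecondarySortSketch::compare);
            // After sorting, records sharing a natural key are adjacent and
            // already in descending value order, so the first `limit`
            // records seen for a key are its largest values.
            for (CompositeKey key : part) {
                List<Long> top =
                    result.computeIfAbsent(key.naturalKey, k -> new ArrayList<>());
                if (top.size() < limit) {
                    top.add(key.value);
                }
            }
        }
        return result;
    }
}
```

Calling SecondarySortSketch.topPerKey(records, numPartitions, 20) on a list
of composite keys would return, for each natural key, up to 20 values in
descending order. The point of the design is that the reducer never has to
buffer or sort millions of values itself; it reads the first 20 and stops.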


Re: Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Kevin Tse
Hi Alex,
I was reading Tom's book, but I had not reached Chapter 6 yet. I have just
read it, and it is really helpful.
Thank you for mentioning it, and thanks also go to Tom.

Kevin
