Re: How to order all the output file if I use more than one reduce node?

2008-08-06 Thread Taeho Kang
You may want to write a partitioner that partitions the output from the
mappers in a way that fits your definition of sorted data (e.g. all keys in
part-00001 are greater than those in part-00000). Once you've done that,
simply concatenating the reduce outputs in order, from part 0 to part N,
will give you a globally sorted result file.
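
To make the idea concrete, here is a minimal sketch of range partitioning,
written as a plain Python simulation rather than real Hadoop code. The
function names, the sample keys, and the split points are all made up for
illustration. In an actual job the same logic would live in a custom
Partitioner class, and the split points would have to be chosen (for example
by sampling the keys) so that the reducers get roughly equal shares.

```python
import bisect


def make_range_partitioner(split_points):
    """Build a key -> partition index function from sorted split points.

    Keys below split_points[0] go to partition 0, keys from split_points[0]
    up to (but not including) split_points[1] go to partition 1, and so on.
    len(split_points) + 1 partitions are produced in total.
    """
    def partition(key):
        return bisect.bisect_right(split_points, key)
    return partition


def simulate_job(keys, split_points):
    """Route each key to a partition, then sort inside each partition.

    The per-partition sort stands in for the sort that each reducer
    already gets on its own input.
    """
    partition = make_range_partitioner(split_points)
    parts = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        parts[partition(key)].append(key)
    return [sorted(part) for part in parts]


# Hypothetical mapper output keys and hand-picked split points (3 reducers).
keys = ["pear", "apple", "zebra", "mango", "kiwi", "banana", "orange"]
split_points = ["h", "p"]

parts = simulate_job(keys, split_points)

# Concatenating the parts in index order gives a globally sorted list.
merged = [key for part in parts for key in part]
```

Because every key in a lower-numbered partition compares less than every key
in a higher-numbered one, no merge step is needed afterwards; the part files
only have to be read in order.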


On Thu, Aug 7, 2008 at 10:26 AM, Kevin <[EMAIL PROTECTED]> wrote:

> I suppose you meant to sort the result globally across files. AFAIK,
> this is not currently supported unless you have only one reducer. It
> is said that version 0.19 will introduce such a capability.
>
> -Kevin
>
>
>
> On Wed, Aug 6, 2008 at 6:01 PM, Xing <[EMAIL PROTECTED]> wrote:
> > If I use one node for reduce, Hadoop can sort the result.
> > If I use 30 nodes for reduce, the result is part-00000 ~ part-00029.
> > How can I make all 30 parts globally sorted, so that all the keys in
> > part-00001 are greater than those in part-00000?
> > Thanks a lot
> >
> > Xing
> >
>


Re: How to order all the output file if I use more than one reduce node?

2008-08-06 Thread Kevin
I suppose you meant to sort the result globally across files. AFAIK,
this is not currently supported unless you have only one reducer. It
is said that version 0.19 will introduce such a capability.

-Kevin



On Wed, Aug 6, 2008 at 6:01 PM, Xing <[EMAIL PROTECTED]> wrote:
> If I use one node for reduce, Hadoop can sort the result.
> If I use 30 nodes for reduce, the result is part-00000 ~ part-00029.
> How can I make all 30 parts globally sorted, so that all the keys in
> part-00001 are greater than those in part-00000?
> Thanks a lot
>
> Xing
>


How to order all the output file if I use more than one reduce node?

2008-08-06 Thread Xing

If I use one node for reduce, Hadoop can sort the result.
If I use 30 nodes for reduce, the result is part-00000 ~ part-00029.
How can I make all 30 parts globally sorted, so that all the keys in
part-00001 are greater than those in part-00000?

Thanks a lot

Xing