Re: How to order all the output file if I use more than one reduce node?

Taeho Kang Wed, 06 Aug 2008 19:54:12 -0700

You may want to write a partitioner that partitions the output from mappers
by key range, in a way that fits your definition of sorted data (e.g. all
keys in part-00001 are greater than those in part-00000). Each reducer
already sorts the keys it receives, so once the partitioner is in place,
concatenating the reduce outputs in order from part-00000 to part-N will
give you a sorted result file.
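
To make the idea concrete, here is a minimal sketch of the range lookup
such a partitioner would perform. It is plain Java with no Hadoop
dependency; the class name RangePartitionSketch, the method partitionFor,
and the split points "h" and "p" are all made up for illustration. In a
real job this logic would sit inside your Partitioner's getPartition
method, and the split points would have to be chosen from your own key
distribution so that the reducers get roughly equal work.

```java
import java.util.Arrays;

public class RangePartitionSketch {

    // Hypothetical helper: splitPoints must be sorted ascending.
    // N split points define N + 1 partitions (0 .. N).
    static int partitionFor(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        // A key equal to a split point goes to the partition on its right.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Assumed split points: keys < "h" go to partition 0,
        // "h" <= key < "p" to partition 1, keys >= "p" to partition 2.
        String[] splitPoints = {"h", "p"};
        for (String key : new String[] {"apple", "h", "kiwi", "zebra"}) {
            int p = partitionFor(key, splitPoints);
            System.out.println(key + " -> part-" + String.format("%05d", p));
        }
    }
}
```

Note that this compares keys with String.compareTo. The comparison used
for partitioning has to agree with the sort order the job applies to its
keys, otherwise the concatenated files will not come out in order.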



On Thu, Aug 7, 2008 at 10:26 AM, Kevin <[EMAIL PROTECTED]> wrote:

> I suppose you meant to sort the result globally across files. AFAIK,
> this is not currently supported unless you have only one reducer. It
> is said that version 0.19 will introduce such capability.
>
> -Kevin
>
>
>
> On Wed, Aug 6, 2008 at 6:01 PM, Xing <[EMAIL PROTECTED]> wrote:
> > If I use one node for reduce, hadoop can sort the result.
> > If I use 30 nodes for reduce, the result is part-00000 ~ part-00029.
> > How can I make all 30 parts globally sorted, so that all the keys in
> > part-00001 are greater than those in part-00000?
> > Thanks a lot
> >
> > Xing
> >
>
