You may want to write a partitioner that splits the mapper output
in a way that fits your definition of sorted data (e.g. all keys in
part-1 are greater than those in part-0). Once you've done that, simply
concatenating the reduce outputs in order, from part 0 to part N, will
give you a sorted result file.
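
A minimal sketch of the idea, in plain Java with no Hadoop dependency.
The class name, method name and split points are made up for
illustration; it only shows how a key can be mapped to a partition
number so that lower-numbered partitions always hold smaller keys.

```java
import java.util.Arrays;

/**
 * Sketch of range partitioning: each key is compared against sorted
 * split points, so partition i only receives keys that are smaller
 * than every key sent to partition i + 1.
 * All names here are hypothetical, not part of Hadoop.
 */
final class RangePartitionSketch {

    // Sorted boundaries; the number of partitions is splitPoints.length + 1.
    private final String[] splitPoints;

    RangePartitionSketch(String[] sortedSplitPoints) {
        this.splitPoints = sortedSplitPoints.clone();
    }

    /** Returns the partition index for the given key. */
    int partitionFor(String key) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // An exact match on split point i goes to partition i + 1.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // insertion point itself is the partition index.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }
}
```

In a real job this logic would go into the getPartition(key, value,
numPartitions) method of a class implementing Hadoop's Partitioner
interface, registered with JobConf.setPartitionerClass. The split
points need to be chosen to match your key distribution (for example
by sampling the input), otherwise some reducers will get far more
data than others.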
I suppose you meant to sort the result globally across files. AFAIK,
this is not currently supported unless you have only one reducer.
Version 0.19 is said to introduce this capability.
-Kevin
On Wed, Aug 6, 2008 at 6:01 PM, Xing <[EMAIL PROTECTED]> wrote:
If I use one node for reduce, Hadoop can sort the result.
If I use 30 nodes for reduce, the result is part-00000 ~ part-00029.
How can I make all 30 parts sorted globally, so that everything in
part-1 is greater than everything in part-0?
Thanks a lot
Xing