It really comes down to generating quantiles of your values and using
them to partition the values across reducers for partial ordering.
Check out the Hadoop TeraSort code. It should do what you want.
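
To make the idea concrete, here is a minimal sketch in plain Python. It is not the TeraSort code and does not use any Hadoop API. The names sample_split_points and partition, the sample size, and the test data are all made up for illustration. It samples the values, takes evenly spaced quantiles of the sample as split points, and routes each value to the reducer whose range contains it.

```python
import bisect
import random


def sample_split_points(values, num_reducers, sample_size=1000, seed=0):
    """Pick num_reducers - 1 split points from a random sample of the values."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(values, min(sample_size, len(values))))
    # Evenly spaced quantiles of the sorted sample become the range boundaries.
    return [sample[(i * len(sample)) // num_reducers]
            for i in range(1, num_reducers)]


def partition(value, split_points):
    """Return the index of the reducer whose key range contains value."""
    return bisect.bisect_right(split_points, value)


# Hypothetical input: 10,000 random integers standing in for the map output keys.
data_rng = random.Random(42)
values = [data_rng.randrange(1_000_000) for _ in range(10_000)]

num_reducers = 4
splits = sample_split_points(values, num_reducers)

# Route every value to a reducer based on the split points.
buckets = [[] for _ in range(num_reducers)]
for v in values:
    buckets[partition(v, splits)].append(v)

# Each reducer sorts only its own range; reading the reducer outputs
# in index order then yields the values in global order.
result = [v for bucket in buckets for v in sorted(bucket)]
```

Because the split points come from a sample, the buckets are only roughly equal in size, but every value in reducer i is still no larger than any value in reducer i + 1.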
On Thu, Nov 11, 2010 at 10:37 AM, Shuja Rehman wrote:
Hi All,
I have a question about map reduce. Suppose I have a set of small files (say
100), usually 8-15 MB each, that need to be processed in a single job. For
each file there will be 1 map process, and hence 100 map processes will be
initiated for 100 files. Now the question is about the number of reducer