Re: Global Sorting and Multiple Reducers ?

2010-11-11 Thread Anthony Urso
It really comes down to generating quantiles of your values and using them to partition the values to reducers for partial ordering. Check out the Hadoop TeraSort code. It should do what you want. On Thu, Nov 11, 2010 at 10:37 AM, Shuja Rehman wrote: > Hi All, > > I have a question about map red
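
The quantile idea in the reply above can be illustrated with a small standalone sketch. This is not the TeraSort code and does not use Hadoop; the function names (choose_cut_points, partition, global_sort) and the sampling parameters are hypothetical, chosen only for illustration. It samples the values, picks approximate quantiles as cut points, routes each value to a "reducer" by range, and sorts each range independently, so that concatenating the reducer outputs in order gives a globally sorted result.

```python
import bisect
import random


def choose_cut_points(sample, num_reducers):
    """Pick num_reducers - 1 split keys (approximate quantiles) from a sample."""
    ordered = sorted(sample)
    if not ordered:
        return []  # no data: every key falls into reducer 0
    return [ordered[len(ordered) * i // num_reducers]
            for i in range(1, num_reducers)]


def partition(key, cut_points):
    """Map a key to a reducer index so that reducer i only sees keys
    that are <= every key sent to reducer i + 1."""
    return bisect.bisect_right(cut_points, key)


def global_sort(values, num_reducers, sample_size=100, seed=0):
    """Simulate the job: sample, partition by range, sort each partition."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(values, min(sample_size, len(values)))
    cut_points = choose_cut_points(sample, num_reducers)

    buckets = [[] for _ in range(num_reducers)]
    for value in values:
        buckets[partition(value, cut_points)].append(value)

    # Each "reducer" sorts only its own key range.
    return [sorted(bucket) for bucket in buckets]


if __name__ == "__main__":
    data_rng = random.Random(42)
    data = [data_rng.randint(0, 10_000) for _ in range(1_000)]
    parts = global_sort(data, num_reducers=4)
    merged = [v for part in parts for v in part]
    print(merged == sorted(data))
```

The quality of the sample only affects how evenly the work is balanced across reducers, not correctness: any set of cut points yields a total order across the outputs. In Hadoop itself, the InputSampler and TotalOrderPartitioner classes play roughly the roles of the sampling and partitioning steps sketched here.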

Global Sorting and Multiple Reducers ?

2010-11-11 Thread Shuja Rehman
Hi All, I have a question about map reduce. Suppose I have a set of small files (say 100), usually 8-15 MB each, that need to be processed in a single job. For each file there will be 1 map process, and hence 100 map processes will be initiated for the 100 files. Now the question is about number of reducer