Hi Rob,

Just curious and off topic :) How do you find the time taken by each reducer? What command/method do you use? I need that for my research.

Thanks,
Mithila

On Sat, Dec 11, 2010 at 4:05 AM, Rob Stewart <robstewar...@googlemail.com> wrote:

> Hi,
>
> I have a problem with a MapReduce job I am trying to run on a 32 node
> cluster.
>
> The final few reducers take a *lot* longer than the rest. e.g. If I
> specify 100 reducers, the first 90 will complete in 5 minutes, and
> then the remaining 10 reducers might take 10 minutes.
>
> Same is true for any number of reducers... 200 reducers: 180/190 will
> complete in 5 minutes, and the last 10/20 will take 10 minutes.
>
> Is this normal Hadoop behavior? I know that the output of the Reducer
> function is not sorted, so can't figure out why this decline of
> performance at the tail end of the job?
>
> thanks,
>
> Rob Stewart
>