On Jul 11, 2008, at 1:35 PM, Mori Bellamy wrote:

hey all,
What dictates the "% complete" bars for map tasks and reduce tasks? I ask because, for one of my map jobs, the tasks hang at 0% for a long time and then jump to 100%.


Maps -> amount of input consumed (this is the normal case when you are processing data on HDFS).

Reduces -> three phases, each a third of the bar:
  0-33%:   shuffle (the phase where the output of the maps is copied)
  33-66%:  merge (the sorted map outputs are merged)
  66-100%: reduce (the user's Reducer.reduce method is invoked)
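To make the arithmetic concrete, here is a rough sketch in Python. It is not Hadoop's actual code: the function names are hypothetical, and the assumption that each reduce phase contributes exactly one third simply follows from the percentages above.

```python
def map_progress(bytes_consumed, total_input_bytes):
    """Fraction of the map task's input consumed so far (0.0 to 1.0).

    Hypothetical helper: assumes progress is measured in input bytes.
    """
    if total_input_bytes <= 0:
        return 0.0
    return min(bytes_consumed / total_input_bytes, 1.0)


def reduce_progress(shuffle_fraction, merge_fraction, reduce_fraction):
    """Overall reduce-task progress (0.0 to 1.0).

    Each argument is how far that phase has gotten, from 0.0 to 1.0.
    Assumes the three phases are weighted equally, one third each.
    """
    phases = (shuffle_fraction, merge_fraction, reduce_fraction)
    # Clamp each phase to [0, 1] before averaging the three.
    return sum(min(max(f, 0.0), 1.0) for f in phases) / 3.0
```

Under these assumptions, a reduce task that has finished shuffling and merging and is halfway through its reduce calls would sit at `reduce_progress(1.0, 1.0, 0.5)`, about 83%, while a map task that has read 50 of 200 input bytes would sit at `map_progress(50, 200)`, or 25%.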

Arun
