Hi,

I am trying to figure out how the reduce progress for a job is calculated. Looking at the syslog generated by my job run, it appears that the reducers start before the mappers complete: the reduce completion percentage is already greater than 0 while the map completion is still below 100%.

So my question is: what does a reduce phase completion of, say, 10% actually mean? I know that the shuffle, sort, and (actual) reduce stages each contribute 0.33 to the completion of a single reducer, and that the completion percentage of each MR phase of the job is calculated in JobInProgress.java as

(map or reduce) progress = current_progress + deltaprogress / (number_of_maps or number_of_reducers)

Given that the number of reducers is small and fixed (3 in my case) and that each reducer has only 3 stages of completion (0.33, 0.67, 1), I would expect the job's reduce progress to increase only in discrete steps of 100/(3 * number of reducers) percent, not continuously. Yet my syslog (below) shows all sorts of percentages as the completion.
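To make my expectation concrete, here is a small Python sketch of the arithmetic. It is only my own illustration of the model described above, not Hadoop code: the function names are made up, and truncating each step to a whole percent is my assumption about how the log prints the value.

```python
from fractions import Fraction

# Assumption: shuffle, sort and reduce each count for 1/3 of one reducer.
PHASES_PER_REDUCER = 3


def expected_progress_steps(num_reducers, phases=PHASES_PER_REDUCER):
    """Job-level reduce percentages reachable if progress only moved
    when one whole phase of one reducer finished."""
    total = num_reducers * phases
    return [Fraction(100 * k, total) for k in range(total + 1)]


def is_reachable(percent, num_reducers):
    """True if a whole-number percentage equals one of the discrete steps
    after truncating that step to an integer (assumed log behaviour)."""
    steps = expected_progress_steps(num_reducers)
    return percent in {int(step) for step in steps}


if __name__ == "__main__":
    # Steps I would expect to see with 3 reducers.
    print([int(step) for step in expected_progress_steps(3)])
    # A value that actually shows up in my log.
    print(is_reachable(10, 3))
```

With 3 reducers this model only allows multiples of 100/9 (roughly 11.1%), so a reported value such as 10% should not be possible under it.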
Thanks,
Virajith

===========================================================================
SYSLOG for a job with 80 mappers and 3 reducers:
===========================================================================
2011-06-04 07:13:22,378 INFO org.apache.hadoop.mapred.JobClient (main): map 0% reduce 0%
2011-06-04 07:13:46,499 INFO org.apache.hadoop.mapred.JobClient (main): map 1% reduce 0%
2011-06-04 07:13:49,506 INFO org.apache.hadoop.mapred.JobClient (main): map 3% reduce 0%
2011-06-04 07:13:52,511 INFO org.apache.hadoop.mapred.JobClient (main): map 4% reduce 0%
2011-06-04 07:13:55,519 INFO org.apache.hadoop.mapred.JobClient (main): map 6% reduce 0%
2011-06-04 07:13:58,526 INFO org.apache.hadoop.mapred.JobClient (main): map 7% reduce 0%
2011-06-04 07:14:01,536 INFO org.apache.hadoop.mapred.JobClient (main): map 8% reduce 0%
2011-06-04 07:14:04,544 INFO org.apache.hadoop.mapred.JobClient (main): map 9% reduce 0%
2011-06-04 07:14:10,561 INFO org.apache.hadoop.mapred.JobClient (main): map 10% reduce 0%
2011-06-04 07:14:13,567 INFO org.apache.hadoop.mapred.JobClient (main): map 11% reduce 0%
2011-06-04 07:14:19,579 INFO org.apache.hadoop.mapred.JobClient (main): map 12% reduce 0%
2011-06-04 07:14:22,585 INFO org.apache.hadoop.mapred.JobClient (main): map 13% reduce 0%
2011-06-04 07:14:25,591 INFO org.apache.hadoop.mapred.JobClient (main): map 14% reduce 0%
2011-06-04 07:14:28,604 INFO org.apache.hadoop.mapred.JobClient (main): map 15% reduce 0%
2011-06-04 07:14:31,610 INFO org.apache.hadoop.mapred.JobClient (main): map 16% reduce 0%
2011-06-04 07:14:34,616 INFO org.apache.hadoop.mapred.JobClient (main): map 17% reduce 1%
2011-06-04 07:14:40,629 INFO org.apache.hadoop.mapred.JobClient (main): map 18% reduce 1%
2011-06-04 07:14:43,634 INFO org.apache.hadoop.mapred.JobClient (main): map 19% reduce 1%
2011-06-04 07:14:44,636 INFO org.apache.hadoop.mapred.JobClient (main): map 19% reduce 2%
2011-06-04 07:14:49,646 INFO org.apache.hadoop.mapred.JobClient (main): map 20% reduce 2%
2011-06-04 07:14:50,648 INFO org.apache.hadoop.mapred.JobClient (main): map 20% reduce 3%
2011-06-04 07:14:59,671 INFO org.apache.hadoop.mapred.JobClient (main): map 21% reduce 4%
2011-06-04 07:15:04,681 INFO org.apache.hadoop.mapred.JobClient (main): map 22% reduce 5%
2011-06-04 07:15:07,688 INFO org.apache.hadoop.mapred.JobClient (main): map 23% reduce 5%
2011-06-04 07:15:10,693 INFO org.apache.hadoop.mapred.JobClient (main): map 24% reduce 5%
2011-06-04 07:15:11,696 INFO org.apache.hadoop.mapred.JobClient (main): map 24% reduce 6%
2011-06-04 07:15:13,700 INFO org.apache.hadoop.mapred.JobClient (main): map 25% reduce 6%
2011-06-04 07:15:14,703 INFO org.apache.hadoop.mapred.JobClient (main): map 26% reduce 6%
2011-06-04 07:15:16,706 INFO org.apache.hadoop.mapred.JobClient (main): map 27% reduce 6%
2011-06-04 07:15:17,709 INFO org.apache.hadoop.mapred.JobClient (main): map 28% reduce 6%
2011-06-04 07:15:20,717 INFO org.apache.hadoop.mapred.JobClient (main): map 29% reduce 6%
2011-06-04 07:15:29,736 INFO org.apache.hadoop.mapred.JobClient (main): map 30% reduce 6%
2011-06-04 07:15:41,762 INFO org.apache.hadoop.mapred.JobClient (main): map 31% reduce 6%
2011-06-04 07:15:43,770 INFO org.apache.hadoop.mapred.JobClient (main): map 32% reduce 7%
2011-06-04 07:15:46,785 INFO org.apache.hadoop.mapred.JobClient (main): map 33% reduce 7%
2011-06-04 07:15:47,787 INFO org.apache.hadoop.mapred.JobClient (main): map 33% reduce 8%
2011-06-04 07:15:49,791 INFO org.apache.hadoop.mapred.JobClient (main): map 34% reduce 8%
2011-06-04 07:15:50,793 INFO org.apache.hadoop.mapred.JobClient (main): map 35% reduce 8%
2011-06-04 07:15:52,798 INFO org.apache.hadoop.mapred.JobClient (main): map 36% reduce 8%
2011-06-04 07:15:55,803 INFO org.apache.hadoop.mapred.JobClient (main): map 37% reduce 8%
2011-06-04 07:15:58,809 INFO org.apache.hadoop.mapred.JobClient (main): map 38% reduce 8%
2011-06-04 07:16:04,821 INFO org.apache.hadoop.mapred.JobClient (main): map 38% reduce 9%
2011-06-04 07:16:07,826 INFO org.apache.hadoop.mapred.JobClient (main): map 39% reduce 9%
2011-06-04 07:16:10,833 INFO org.apache.hadoop.mapred.JobClient (main): Task Id : attempt_201106040712_0001_m_000024_0, Status : FAILED
2011-06-04 07:16:12,086 INFO org.apache.hadoop.mapred.JobClient (main): map 38% reduce 9%
2011-06-04 07:16:14,090 INFO org.apache.hadoop.mapred.JobClient (main): Task Id : attempt_201106040712_0001_m_000027_0, Status : FAILED
2011-06-04 07:16:15,134 INFO org.apache.hadoop.mapred.JobClient (main): map 37% reduce 9%
2011-06-04 07:16:26,162 INFO org.apache.hadoop.mapred.JobClient (main): map 38% reduce 9%
2011-06-04 07:16:30,169 INFO org.apache.hadoop.mapred.JobClient (main): map 39% reduce 9%
2011-06-04 07:16:33,177 INFO org.apache.hadoop.mapred.JobClient (main): map 40% reduce 9%
2011-06-04 07:16:36,184 INFO org.apache.hadoop.mapred.JobClient (main): map 41% reduce 9%
2011-06-04 07:16:42,199 INFO org.apache.hadoop.mapred.JobClient (main): map 42% reduce 9%
2011-06-04 07:17:07,258 INFO org.apache.hadoop.mapred.JobClient (main): map 42% reduce 10%
2011-06-04 07:17:15,274 INFO org.apache.hadoop.mapred.JobClient (main): map 43% reduce 10%
2011-06-04 07:17:16,276 INFO org.apache.hadoop.mapred.JobClient (main): map 44% reduce 10%
2011-06-04 07:17:18,283 INFO org.apache.hadoop.mapred.JobClient (main): map 45% reduce 10%
2011-06-04 07:17:19,285 INFO org.apache.hadoop.mapred.JobClient (main): map 46% reduce 10%
2011-06-04 07:17:22,290 INFO org.apache.hadoop.mapred.JobClient (main): map 47% reduce 10%
2011-06-04 07:17:24,294 INFO org.apache.hadoop.mapred.JobClient (main): map 48% reduce 10%
2011-06-04 07:17:25,296 INFO org.apache.hadoop.mapred.JobClient (main): map 49% reduce 10%
2011-06-04 07:17:28,303 INFO org.apache.hadoop.mapred.JobClient (main): map 50% reduce 11%
2011-06-04 07:17:33,318 INFO org.apache.hadoop.mapred.JobClient (main): map 51% reduce 11%
2011-06-04 07:17:42,337 INFO org.apache.hadoop.mapred.JobClient (main): map 52% reduce 11%
2011-06-04 07:17:46,345 INFO org.apache.hadoop.mapred.JobClient (main): map 53% reduce 11%
2011-06-04 07:17:49,352 INFO org.apache.hadoop.mapred.JobClient (main): map 54% reduce 11%
2011-06-04 07:17:52,463 INFO org.apache.hadoop.mapred.JobClient (main): map 55% reduce 11%
2011-06-04 07:17:55,469 INFO org.apache.hadoop.mapred.JobClient (main): map 56% reduce 11%
2011-06-04 07:17:58,476 INFO org.apache.hadoop.mapred.JobClient (main): map 57% reduce 11%
2011-06-04 07:18:17,525 INFO org.apache.hadoop.mapred.JobClient (main): map 57% reduce 12%
2011-06-04 07:18:19,528 INFO org.apache.hadoop.mapred.JobClient (main): map 58% reduce 12%
2011-06-04 07:18:22,535 INFO org.apache.hadoop.mapred.JobClient (main): map 59% reduce 12%
2011-06-04 07:18:23,537 INFO org.apache.hadoop.mapred.JobClient (main): map 60% reduce 12%
2011-06-04 07:18:25,541 INFO org.apache.hadoop.mapred.JobClient (main): map 60% reduce 13%
2011-06-04 07:18:26,542 INFO org.apache.hadoop.mapred.JobClient (main): map 61% reduce 13%
2011-06-04 07:18:29,548 INFO org.apache.hadoop.mapred.JobClient (main): map 63% reduce 13%
2011-06-04 07:18:32,554 INFO org.apache.hadoop.mapred.JobClient (main): map 64% reduce 13%
2011-06-04 07:18:35,560 INFO org.apache.hadoop.mapred.JobClient (main): map 65% reduce 13%
2011-06-04 07:18:38,566 INFO org.apache.hadoop.mapred.JobClient (main): map 66% reduce 14%
2011-06-04 07:18:44,580 INFO org.apache.hadoop.mapred.JobClient (main): map 67% reduce 14%
2011-06-04 07:18:47,586 INFO org.apache.hadoop.mapred.JobClient (main): map 67% reduce 15%
2011-06-04 07:18:50,593 INFO org.apache.hadoop.mapred.JobClient (main): map 68% reduce 15%
2011-06-04 07:18:53,601 INFO org.apache.hadoop.mapred.JobClient (main): map 69% reduce 15%
2011-06-04 07:18:59,614 INFO org.apache.hadoop.mapred.JobClient (main): map 70% reduce 15%
2011-06-04 07:19:02,621 INFO org.apache.hadoop.mapred.JobClient (main): map 71% reduce 15%
2011-06-04 07:19:08,633 INFO org.apache.hadoop.mapred.JobClient (main): map 72% reduce 15%
2011-06-04 07:19:16,653 INFO org.apache.hadoop.mapred.JobClient (main): map 73% reduce 15%
2011-06-04 07:19:19,660 INFO org.apache.hadoop.mapred.JobClient (main): map 74% reduce 15%
2011-06-04 07:19:25,673 INFO org.apache.hadoop.mapred.JobClient (main): map 75% reduce 15%
2011-06-04 07:19:28,680 INFO org.apache.hadoop.mapred.JobClient (main): map 76% reduce 15%
2011-06-04 07:19:30,684 INFO org.apache.hadoop.mapred.JobClient (main): map 76% reduce 16%
2011-06-04 07:19:36,699 INFO org.apache.hadoop.mapred.JobClient (main): map 77% reduce 16%
2011-06-04 07:19:39,706 INFO org.apache.hadoop.mapred.JobClient (main): map 78% reduce 16%
2011-06-04 07:19:42,714 INFO org.apache.hadoop.mapred.JobClient (main): map 79% reduce 16%
2011-06-04 07:19:48,730 INFO org.apache.hadoop.mapred.JobClient (main): map 80% reduce 17%
2011-06-04 07:19:51,736 INFO org.apache.hadoop.mapred.JobClient (main): map 81% reduce 17%
2011-06-04 07:19:58,750 INFO org.apache.hadoop.mapred.JobClient (main): map 82% reduce 17%
2011-06-04 07:20:01,757 INFO org.apache.hadoop.mapred.JobClient (main): map 83% reduce 18%
2011-06-04 07:20:04,764 INFO org.apache.hadoop.mapred.JobClient (main): map 84% reduce 18%
2011-06-04 07:20:09,775 INFO org.apache.hadoop.mapred.JobClient (main): map 85% reduce 18%
2011-06-04 07:20:12,785 INFO org.apache.hadoop.mapred.JobClient (main): map 86% reduce 18%
2011-06-04 07:20:15,791 INFO org.apache.hadoop.mapred.JobClient (main): map 87% reduce 19%
2011-06-04 07:20:18,796 INFO org.apache.hadoop.mapred.JobClient (main): map 88% reduce 19%
2011-06-04 07:20:21,803 INFO org.apache.hadoop.mapred.JobClient (main): map 89% reduce 19%
2011-06-04 07:20:24,810 INFO org.apache.hadoop.mapred.JobClient (main): map 90% reduce 19%
2011-06-04 07:20:29,820 INFO org.apache.hadoop.mapred.JobClient (main): Task Id : attempt_201106040712_0001_m_000069_0, Status : FAILED
2011-06-04 07:20:31,270 INFO org.apache.hadoop.mapred.JobClient (main): map 89% reduce 19%
2011-06-04 07:20:37,284 INFO org.apache.hadoop.mapred.JobClient (main): map 90% reduce 19%
2011-06-04 07:20:46,303 INFO org.apache.hadoop.mapred.JobClient (main): map 91% reduce 20%
2011-06-04 07:20:49,310 INFO org.apache.hadoop.mapred.JobClient (main): map 92% reduce 20%
2011-06-04 07:20:52,315 INFO org.apache.hadoop.mapred.JobClient (main): map 93% reduce 20%
2011-06-04 07:20:55,323 INFO org.apache.hadoop.mapred.JobClient (main): map 94% reduce 20%
2011-06-04 07:20:58,330 INFO org.apache.hadoop.mapred.JobClient (main): map 95% reduce 20%
2011-06-04 07:20:59,333 INFO org.apache.hadoop.mapred.JobClient (main): Task Id : attempt_201106040712_0001_m_000073_0, Status : FAILED
2011-06-04 07:21:00,347 INFO org.apache.hadoop.mapred.JobClient (main): map 94% reduce 20%
2011-06-04 07:21:04,354 INFO org.apache.hadoop.mapred.JobClient (main): map 95% reduce 20%
2011-06-04 07:21:13,377 INFO org.apache.hadoop.mapred.JobClient (main): map 96% reduce 20%
2011-06-04 07:21:19,390 INFO org.apache.hadoop.mapred.JobClient (main): map 98% reduce 20%
2011-06-04 07:21:25,401 INFO org.apache.hadoop.mapred.JobClient (main): map 99% reduce 21%
2011-06-04 07:21:40,441 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 21%
2011-06-04 07:22:22,539 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 33%
2011-06-04 07:22:31,570 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 35%
2011-06-04 07:22:35,579 INFO org.apache.hadoop.mapred.JobClient (main): Task Id : attempt_201106040712_0001_r_000000_0, Status : FAILED
2011-06-04 07:22:36,659 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 29%
2011-06-04 07:22:40,668 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 30%
2011-06-04 07:22:45,679 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 31%
2011-06-04 07:22:51,690 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 32%
2011-06-04 07:22:55,700 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 33%
2011-06-04 07:23:01,714 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 34%
2011-06-04 07:23:10,733 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 35%
2011-06-04 07:23:26,766 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 36%
2011-06-04 07:23:32,778 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 37%
2011-06-04 07:23:52,820 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 38%
2011-06-04 07:23:56,827 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 39%
2011-06-04 07:24:05,848 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 40%
2011-06-04 07:24:16,871 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 41%
2011-06-04 07:24:34,908 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 42%
2011-06-04 07:24:48,938 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 43%
2011-06-04 07:24:54,953 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 44%
2011-06-04 07:25:06,976 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 45%
2011-06-04 07:25:07,979 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 56%
2011-06-04 07:25:15,996 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 57%
2011-06-04 07:25:26,034 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 58%
2011-06-04 07:25:38,058 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 59%
2011-06-04 07:25:47,076 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 60%
2011-06-04 07:25:50,082 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 61%
2011-06-04 07:25:59,099 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 62%
2011-06-04 07:26:02,106 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 63%
2011-06-04 07:26:11,135 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 64%
2011-06-04 07:26:17,146 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 65%
2011-06-04 07:26:29,170 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 66%
2011-06-04 07:26:41,195 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 67%
2011-06-04 07:26:47,209 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 68%
2011-06-04 07:26:50,215 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 69%
2011-06-04 07:26:59,242 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 70%
2011-06-04 07:27:08,276 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 71%
2011-06-04 07:27:20,301 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 72%
2011-06-04 07:27:26,318 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 73%
2011-06-04 07:27:29,324 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 84%
2011-06-04 07:27:32,329 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 85%
2011-06-04 07:27:38,342 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 86%
2011-06-04 07:27:47,363 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 87%
2011-06-04 07:27:50,369 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 88%
2011-06-04 07:27:59,390 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 89%
2011-06-04 07:28:08,419 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 90%
2011-06-04 07:28:21,447 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 91%
2011-06-04 07:28:33,471 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 92%
2011-06-04 07:28:48,501 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 93%
2011-06-04 07:29:00,532 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 94%
2011-06-04 07:29:15,560 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 95%
2011-06-04 07:29:30,592 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 96%
2011-06-04 07:29:45,625 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 97%
2011-06-04 07:30:00,655 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 98%
2011-06-04 07:30:18,693 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 99%
2011-06-04 07:30:30,724 INFO org.apache.hadoop.mapred.JobClient (main): map 100% reduce 100%